Diagnostic, prognostic and therapeutic uses of long non-coding RNAs for cancer and regenerative medicine

ABSTRACT

Long non-coding RNAs (lncRNAs) and methods of using them diagnostically and therapeutically for treatment of cancer, stem cell therapy, or regenerative medicine are disclosed. In particular, the invention relates to lncRNAs that play roles in regulation of genes involved in cell proliferation, differentiation, and apoptosis. Such lncRNAs can be used as biomarkers to monitor cell proliferation and differentiation during cancer progression or tissue regeneration. One of the identified lncRNAs, referred to as PANDA (a P21-Associated NcRNA, DNA damage Activated), inhibits the expression of apoptotic genes normally activated by the transcription factor NF-YA. Inhibitors of PANDA sensitize cancerous cells to chemotherapy and can be used in combination with chemotherapeutic agents for treatment of cancer.

CROSS-REFERENCE TO RELATED APPLICATION

This application is a divisional application of U.S. application Ser. No. 13/470,233, filed May 11, 2012, which claims benefit under 35 U.S.C. §119(e) of provisional application 61/486,025, filed May 13, 2011, all of which applications are hereby incorporated herein by reference in their entireties.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

This invention was made with Government support under contracts CA118750 and AR054615 awarded by the National Institutes of Health. The Government has certain rights in this invention.

TECHNICAL FIELD

The present invention pertains generally to long non-coding RNAs (lncRNAs) and methods of using them diagnostically and therapeutically. In particular, the invention relates to lncRNAs that play roles in regulation of genes involved in cell proliferation, differentiation, and apoptosis, and their uses in treatment of cancer, stem cell therapy, or regenerative medicine.

BACKGROUND

Mammalian genomes are more pervasively transcribed than previously expected (Bertone et al. (2004) Science 306:2242-2246; Carninci et al. (2005) Science 309:1559-1563; Calin et al. (2007) Cancer Cell 12:215-229; and Carninci (2008) Nat. Cell Biol. 10:1023-1024). In addition to the protein-coding regions of genes, much of the genome is transcribed as non-coding RNAs (ncRNAs). These non-coding genomic transcripts include many different types of small regulatory ncRNAs and long ncRNAs (lncRNAs).

Included among the small non-coding RNAs are small interfering RNAs (siRNAs), microRNAs (miRNAs) and Piwi-associated RNAs (piRNAs), which function in genome defense and post-transcriptional regulation (Mattick et al. (2005) Hum. Mol. Genet. 14 Spec No 1, R121-R132; He et al. (2004) Nat. Rev. Genet. 5:522-531; and Hutvagner et al. (2008) Nat. Rev. Mol. Cell. Biol. 9:22-32). In addition, divergent transcription by RNA polymerase near transcriptional start sites (TSS) can result in generation of small ncRNAs, ranging from 20 to 200 nucleotides. These ncRNAs have been variously named promoter-associated small RNAs (PASRs), transcription-initiation RNAs (tiRNAs) and TSS-associated RNAs (TSSa-RNAs) (Kapranov et al. (2007) Science 316:1484-1488; Seila et al. (2008) Science 322:1849-1851; Taft et al. (2009) Nat. Genet. 41:572-578; and Core (2008) Science 322:1845-1848). It remains uncertain, however, if these ncRNAs are functional or just represent byproducts of RNA polymerase infidelity (Ponjavic et al. (2007) Genome Res. 17:556-565; Struhl (2007) Nat. Struct. Mol. Biol. 14:103-105).

Long ncRNAs vary in length from several hundred bases to tens of kilobases and may be located separate from protein coding genes (long intergenic ncRNAs or lincRNAs), or reside near or within protein coding genes (Guttman et al. (2009) Nature 458:223-227; Katayama et al. (2005) Science 309:1564-1566). Recent evidence indicates that active enhancer elements may also be transcribed as lncRNAs (Kim et al. (2010) Nature 465:182-187; De Santa et al. (2010) PLoS Biol. 8:e1000384).

Several lncRNAs have been implicated in transcriptional regulation. For example, in the CCND1 (encoding cyclin D1) promoter, an ncRNA transcribed 2 kb upstream of CCND1 is induced by ionizing radiation and regulates transcription of CCND1 in cis by forming a ribonucleoprotein repressor complex (Wang et al. (2008) Nature 454:126-130). This ncRNA binds to and allosterically activates the RNA-binding protein TLS (translated in liposarcoma), which inhibits histone acetyltransferases, resulting in repression of CCND1 transcription. Another example is the antisense ncRNA CDKN2B-AS1 (also known as p15AS or ANRIL), which overlaps the p15 coding sequence. Expression of CDKN2B-AS1 is increased in human leukemias and inversely correlated with p15 expression (Pasmant et al. (2007) Cancer Res. 67:3963-3969; Yu et al. (2008) Nature 451:202-206). CDKN2B-AS1 can transcriptionally silence p15 directly as well as through induction of heterochromatin formation. Many well-studied lncRNAs, such as those involved in dosage compensation and imprinting, regulate gene expression in cis (Lee (2009) Genes Dev. 23:1831-1842). Other lincRNAs, such as HOTAIR and linc-p21, regulate the activity of distantly located genes in trans (Rinn et al. (2007) Cell 129:1311-1323; Gupta et al. (2010) Nature 464:1071-1076; and Huarte et al. (2010) Cell 142:409-419).

A number of the identified lncRNAs are differentially expressed in association with cell proliferation, differentiation, or apoptosis and could have important roles in regulating cell function (Huarte et al. (2010) Cell 142(3):409-419; Loewer et al. (2010) Nat. Genet. 42(12):1113-1117; Ponjavic et al. (2009) PLoS Genet. 5(8):e1000617; Gupta et al. (2010) Nature 464(7291):1071-1076; and Mazar et al. (2010) Mol. Genet. Genomics 284:1-9). Such lncRNAs may potentially be useful diagnostically or therapeutically; however, the functions of only a few of these lncRNAs have been studied in detail, and many more functional lncRNAs have yet to be discovered. Thus, there remains a need in the art for identifying and characterizing lncRNAs that can be used in developing diagnostics and therapeutics.

SUMMARY

The invention relates to long non-coding RNAs (lncRNAs) and their diagnostic, prognostic, and therapeutic uses for cancer, stem cell therapy, and regenerative medicine. In particular, the invention relates to lncRNAs that play roles in regulation of genes involved in cell proliferation, differentiation, and apoptosis. Such lncRNAs can be used as biomarkers to monitor cell proliferation and differentiation during cancer progression, stem cell therapy, or tissue regeneration. One of the identified lncRNAs, referred to as PANDA (a P21-Associated NcRNA, DNA damage Activated), inhibits the expression of apoptotic genes normally activated by the transcription factor NF-YA. Inhibitors of PANDA sensitize cancerous cells to chemotherapy and can be used in combination with chemotherapeutic agents for treating cancer.

Biomarkers that can be used in the practice of the invention includelncRNAs, such as, but not limited to int:CDK6:143, dst:CDKN2A:43877,upst:CCNF:−1721, upst:CCNI:−6398, upst:CCNI:−6621, upst:CCNI:−6883,upst:CDKN1A:−4845, upst:CDK5R1:−4044, upst:CDK5R1:−4410,upst:CCNL2:−1391, upst:CCNL2:−2253, upst:CCNL2:−767, int:CDKN2D:1417,upst:CCNL2:−5540, int:CDKN1A:1420, int:CCNT1:602, upst:CCNL2:−3110,upst:CDK5R1:−5717, upst:CCNL2:−982, upst:CCNE2:−682, int:CDK5R1:183,upst:CDK5R1:−482, upst:CDK8:−798, upst:CDK9:−646, upst:CDK6:−1860,int:CDK6:1276, upst:CDK6:−533, upst:CDKN2C:−8037, upst:CCNK:−899,upst:CNNM3:−248, upst:CDKN1C:−4619, int:CDKN2A:6667, int:ARF:4530,upst:CDKN2B:−15913, upst:CDK6:−1679, upst:CDKN1A:−1210, int:CDKN2B:1926,dst:CDKN2A:39498, upst:CCNL1:−1968, upst:CCNL1:−2234, upst:CCNL1:−2383,upst:CCNL1:−2767, upst:CDK5R2:−6418, upst:CDK4:−7794, upst:CDKN1A:−5830,int:CDKN2C:159, upst:CCNYL2:−36, upst:CCNC:−6760, upst:CDKN2B:−2817,upst:CNNM3:−970, upst:CDK5R2:−6045, upst:CDKN1C:−2196, int:CCND1:874,int:CCND2:1205, upst:CDKN1C:−446, int:CCNG2:390, upst:CDK3:−4148,upst:CCNA2:−250, int:CDKL5:64, upst:CCND2: 3165, int:CCNK:210,int:CDKN1A:885, upst:CDK5R2:−9197, int:CNNM3:1459, upst:CCND1:−1659,int:CCNL2:463, upst:CCNE1:−1190, upst:CDK5R2:−8037, upst:CDKL3:−867,int:CCNG1:381, upst:CCND2:−2874, upst:CDKN2B:−130736, int:CCNI:1042,upst:CCND2:−4757, int:CDK9:352, int:CCND2:1689, int:CDKL5:1682,upst:CDK5R2:−4541, upst:CDK5:−7855, upst:CDK9:−1536, upst:CCND2:−1291,upst:CCND1:−377, int:CCNL1:1097, upst:CDK5R2:−648, upst:CCNL2:−7336,upst:CCND1:−2768, upst:CDK2:−1390, upst:CCNYL3:−8181, dst:CDKN2A:8650,upst:CDK8:−265, upst:CDK4:−4462, upst:CDKN2A:−44, int:CDKN2A:5270,upst:CCNJL:−2749, upst:CNNM4:−1843, upst:CDK5R2:−7376, int:CCNO:1417,upst:CDKN1C:−5, upst:CDKN1C:−6280, upst:ARF:−840, upst:CCND2:−1830,upst:CDK5R1:−206, upst:CCNA1:−1163, int:CCNE2:647, upst:CDK9:−909,upst:CCNYL3:−293, upst:CDKN3:−271, int:CCNT2:640, upst:CCND1:−2574,upst:CCNT2:−319, upst:CDK5R1:−3023, 
upst:CDK9:−3159, upst:CDK9:−8667,upst:CCNE2:−4956, int:CCND3:2384, upst:CDKN1B:−1362, upst:CCNI:−7899,upst:CCNT2:−6751, int:CDK5:1993, upst:CDK9:−8509, upst:CCND1:−7190,upst:CDKN1C:−7144, upst:CDKN3:−4479, upst:CCNB3:−3258, upst:CCND3:−9303,upst:CDK8:−8337, int:CDKN2C:643, upst:CCNYL3:−1019, upst:CDK5:−2373,int:CNNM4:1658, upst:CCNE2:−8552, upst:CCNG1:−9141, upst:CCND2:−4886,upst:CCNK:−8357, upst:CDK5:−9105, upst:CDKN2B:−108997, int:CCNB2:547,upst:CDKN3:−2291, dst:CDKN2A:30203, upst:CDK2:−5210, upst:CCNL1:−3430,upst:CCNF:−3964, upst:CCNK:−4426, upst:CCNF:−3743, upst:CDK5:−3754,upst:CDKN2B:−35359, upst:CDKN2B:−87467, upst:CDK5R2:−4915,upst:CCNF:−2075, upst:CDK6:−8726, upst:CDKN2B:−90566, int:CDKN2A:4904,int:CDKN2A:4432, upst:ARF:−2148, upst:CDKN2B:−130339, upst:CNNM3:−9238,upst:CCNG1:−4532, int:ARF:15754, upst:CCNF:−1085, upst:CDKN2B:−23831,upst:CDKN1A:−9569, int:CCNI:1874, dst:CDKN2A:45866, int:CCNC:816,upst:CCNC:−5405, upst:CDK4:−1632, upst:CCNK:−3241, upst:CDK10:−1805,upst:CCNJL:−671, upst:CDKN3:−5723, upst:CDKN2B:−15114, upst:CCNE2:−5939,upst:CCNJL:−7299, upst:CCND3:−4248, int:CDK9:1811, upst:CDKN2C:−8538,upst:ARF:−1395, upst:CCND2:−6904, upst:CDK4:−977, upst:CCNE1:−5422,upst:CCNE2:−2828, upst:CDK4:−2133, upst:CDK8:−9630, upst:CDK3:−4497,upst:CCND3:−6423, upst:CCND1:−8918, upst:CDKN2B:−119804,upst:CDKN3:−5438, upst:CDKN2C:−7397, upst:CCNYL1:−3709,upst:CDKL4:−6205, upst:CDKN1A:−2237, upst:CDKN1C:−4093,upst:CCND2:−9042, int:CDK8:566, upst:CDKN2B:−804, upst:CCNE1:−4445,upst:CDKN2B:−74328, upst:CDKN2B:−53107, upst:CCNE1:−9426,upst:CDKN2C:−3161, upst:CCNG2:−2953, upst:CNNM1:−2645,upst:CDKN1A:−1902, upst:CDKN3:−1974, upst:CDK10:−4173, upst:CDK9:−9782,upst:CDKN1C:−5693, upst:CDK5:−9871, upst:CNNM4:−4755,upst:CDKN2B:−31120, upst:CDK2:−8040, upst:CDKN2B:−75214,upst:CDKN2C:−127, upst:CDKN1C:−1017, and upst:CNNM4:−3840;polynucleotide fragments thereof, and variants comprising nucleotidesequences displaying at least about 80-100% sequence identity thereto,including any 
percent identity within this range, such as 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99% sequence identity thereto. Biomarkers can be used alone or in combination with additional biomarkers or relevant clinical parameters in prognosis, diagnosis, or monitoring treatment of cancer, stem cell therapy, or regenerative medicine.
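By way of non-limiting illustration, the sequence identity threshold mentioned above can be sketched as follows. The sketch assumes two nucleotide sequences that have already been aligned to equal length; the function names, the gap character, and the choice to count a gap opposite a base as a mismatch are hypothetical conventions adopted only for this example, since no particular alignment or identity algorithm is specified in this passage.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns.

    Assumption for this sketch: columns where both sequences have a gap are
    skipped, and a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # nothing to compare in this column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned 10-nucleotide sequences that differ at a single position have 90% identity under this convention and would satisfy an 80% threshold.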

Biomarker polynucleotides (e.g., lncRNAs) can be detected, for example, by microarray analysis, polymerase chain reaction (PCR), reverse transcriptase polymerase chain reaction (RT-PCR), Northern blot, serial analysis of gene expression (SAGE), immunoassay, or mass spectrometry.

In one aspect, the invention provides a method for diagnosing cancer in a subject, comprising measuring the level of a plurality of biomarkers in a biological sample derived from a subject suspected of having cancer, and analyzing the levels of the biomarkers and comparing with respective reference value ranges for the biomarkers, wherein differential expression of one or more biomarkers in the biological sample compared to one or more biomarkers in a control sample indicates that the subject has cancer. In one embodiment, the plurality of biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNL1:−2767, int:CDKN1A:+885, upst:CDKN1A:−4845, upst:CDKN2B:−2817, upst:CDK9:−9782, int:ARF:+4517, int:ARF:+4530, upst:CDKN1C:−1017, int:CCNG1:+381, and upst:CCNG2:−2953. In certain embodiments, PANDA (upst:CDKN1A:−4845) is used alone or in combination with one or more additional biomarkers or relevant clinical parameters in prognosis, diagnosis, or monitoring treatment of cancer. In certain embodiments, the cancer comprises a mutation in the TP53 gene.

In certain embodiments, the level of one or more biomarkers is compared with reference value ranges for the biomarkers. The reference value ranges can represent the level of one or more biomarkers found in one or more samples of one or more subjects without cancer (i.e., normal or control samples). Alternatively, the reference values can represent the level of one or more biomarkers found in one or more samples of one or more subjects with cancer. More specifically, the reference value ranges can represent the level of one or more biomarkers at particular stages of disease (e.g., mild, moderate, or severe dysplasia, cancer in situ, or invasive cancer) to facilitate a determination of the stage of disease progression in an individual and an appropriate treatment regimen.
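By way of non-limiting illustration, the comparison of a measured biomarker level against reference value ranges can be sketched as follows. The function name, the range labels, and the numeric values are hypothetical and made up purely for the example; they are not clinical values and are not taken from this disclosure.

```python
from typing import Dict, List, Tuple

# Maps a label (e.g., a control group or a disease stage) to an inclusive
# (low, high) range of biomarker levels. All values here are invented.
ReferenceRanges = Dict[str, Tuple[float, float]]


def matching_categories(level: float, ranges: ReferenceRanges) -> List[str]:
    """Return the labels of every reference range that contains the level."""
    return [label for label, (low, high) in ranges.items()
            if low <= level <= high]


# Hypothetical ranges in arbitrary expression units.
example_ranges: ReferenceRanges = {
    "control": (0.0, 1.5),
    "stage_a": (2.0, 4.0),
    "stage_b": (5.0, 10.0),
}
```

A level that falls inside no range returns an empty list, which a caller might treat as indeterminate rather than forcing it into the nearest category.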

In another embodiment, the invention includes a method for monitoring the efficacy of a therapy for treating cancer in a subject, the method comprising: analyzing the level of each of one or more biomarkers in samples derived from the subject before and after the subject undergoes said therapy, in conjunction with respective reference value ranges for said one or more biomarkers, wherein the one or more biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNL1:−2767, int:CDKN1A:+885, upst:CDKN1A:−4845, upst:CDKN2B:−2817, upst:CDK9:−9782, int:ARF:+4517, int:ARF:+4530, upst:CDKN1C:−1017, int:CCNG1:+381, and upst:CCNG2:−2953.

In another embodiment, the invention includes a method for evaluating the effect of an agent for treating cancer in a subject, the method comprising: analyzing the level of each of one or more biomarkers in samples derived from the subject before and after the subject is treated with said agent, in conjunction with respective reference value ranges for said one or more biomarkers, wherein the one or more biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNL1:−2767, int:CDKN1A:+885, upst:CDKN1A:−4845, upst:CDKN2B:−2817, upst:CDK9:−9782, int:ARF:+4517, int:ARF:+4530, upst:CDKN1C:−1017, int:CCNG1:+381, and upst:CCNG2:−2953.
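By way of non-limiting illustration, the before-and-after comparisons described above can be summarized as a per-biomarker log2 fold change. The function names, the pseudocount, and the use of log2 fold change as the summary statistic are assumptions made for this sketch; the methods above do not prescribe a particular statistic.

```python
import math
from typing import Dict


def log2_fold_change(before: float, after: float,
                     pseudocount: float = 1e-9) -> float:
    """log2(after / before); the small pseudocount avoids division by zero."""
    return math.log2((after + pseudocount) / (before + pseudocount))


def summarize_changes(before: Dict[str, float],
                      after: Dict[str, float]) -> Dict[str, float]:
    """log2 fold change for every biomarker measured at both time points."""
    shared = before.keys() & after.keys()
    return {name: log2_fold_change(before[name], after[name])
            for name in sorted(shared)}
```

Positive values indicate a higher level after treatment and negative values a lower level; biomarkers measured at only one time point are left out of the summary.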

In another aspect, the invention includes a method for monitoring tissue regeneration in a subject, the method comprising measuring the level of a plurality of biomarkers in a biological sample derived from the subject, wherein the plurality of biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017; and analyzing the levels of the biomarkers in conjunction with respective reference value ranges for said plurality of biomarkers, wherein differential expression of one or more biomarkers in the biological sample compared to one or more biomarkers in a control sample indicates whether the tissue is regenerating.

In another embodiment, the invention includes a method for monitoring cell differentiation in a tissue grown in culture, the method comprising measuring the level of a plurality of biomarkers in a cell derived from the tissue, wherein the plurality of biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017; and analyzing the levels of the biomarkers in conjunction with respective reference value ranges for said plurality of biomarkers, wherein differential expression of one or more biomarkers in the biological sample compared to one or more biomarkers in a control sample indicates the state of differentiation of the tissue. In certain embodiments, the tissue is derived from a stem cell. The stem cell can be an embryonic stem cell, an adult stem cell, or a cord blood stem cell, and can be totipotent, pluripotent, multipotent, or unipotent.

In another embodiment, the invention includes a method for evaluating the effect of an agent for regenerating tissue in a subject, the method comprising: analyzing the level of each of one or more biomarkers in samples derived from the subject before and after the subject is treated with said agent, in conjunction with respective reference value ranges for said one or more biomarkers, wherein the one or more biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017.

In another embodiment, the invention includes a method for monitoring the efficacy of a therapy for regenerating tissue in a subject, the method comprising: analyzing the level of each of one or more biomarkers in samples derived from the subject before and after the subject undergoes said therapy, in conjunction with respective reference value ranges for said one or more biomarkers, wherein the one or more biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017.

In another embodiment, the invention includes a method for evaluating the effect of an agent for inducing differentiation of a stem cell in a subject, the method comprising: analyzing the level of each of one or more biomarkers in samples derived from the subject before and after the subject is treated with said agent, in conjunction with respective reference value ranges for said one or more biomarkers, wherein the one or more biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017.

In another embodiment, the invention includes a method for monitoring the efficacy of stem cell therapy in a subject, the method comprising: analyzing the level of each of one or more biomarkers in samples derived from the subject before and after the subject undergoes said stem cell therapy, in conjunction with respective reference value ranges for said one or more biomarkers, wherein the one or more biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017.

In another embodiment, the invention includes a method for evaluating the effect of an agent for inducing differentiation of a stem cell, the method comprising growing the stem cell in culture; treating the culture with the agent; measuring the level of a plurality of biomarkers in a cultured cell derived from the stem cell after treating the culture with the agent, wherein the plurality of biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017; and analyzing the levels of the biomarkers in conjunction with respective reference value ranges for said plurality of biomarkers.

In certain embodiments, a panel of biomarkers is used for diagnosing cancer or monitoring cancer progression, stem cell therapy, or regenerative medical treatments. Biomarker panels of any size can be used in the practice of the invention. Biomarker panels typically comprise at least 4 biomarkers and up to 30 biomarkers, including any number of biomarkers in between, such as 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 biomarkers. In certain embodiments, the invention includes a biomarker panel comprising at least 4, or at least 5, or at least 6, or at least 7, or at least 8, or at least 9, or at least 10 or more biomarkers. Although smaller biomarker panels are usually more economical, larger biomarker panels (i.e., greater than 30 biomarkers) have the advantage of providing more detailed information and can also be used in the practice of the invention.

In certain embodiments, the invention includes a biomarker panelcomprising a plurality of lncRNAs selected from the group consisting ofint:CDK6:143, dst:CDKN2A:43877, upst:CCNF:−1721, upst:CCNI:−6398,upst:CCNI:−6621, upst:CCNI:−6883, upst:CDKN1A:−4845, upst:CDK5R1:−4044,upst:CDK5R1:−4410, upst:CCNL2:−1391, upst:CCNL2:−2253, upst:CCNL2:−767,int:CDKN2D:1417, upst:CCNL2:−5540, int:CDKN1A:1420, int:CCNT1:602,upst:CCNL2:−3110, upst:CDK5R1:−5717, upst:CCNL2:−982, upst:CCNE2:−682,int:CDK5R1:183, upst:CDK5R1:−482, upst:CDK8:−798, upst:CDK9:−646,upst:CDK6:−1860, int:CDK6:1276, upst:CDK6:−533, upst:CDKN2C:−8037,upst:CCNK:−899, upst:CNNM3:−248, upst:CDKN1C:−4619, int:CDKN2A:6667,int:ARF:4530, upst:CDKN2B:−15913, upst:CDK6:−1679, upst:CDKN1A:−1210,int:CDKN2B:1926, dst:CDKN2A:39498, upst:CCNL1:−1968, upst:CCNL1:−2234,upst:CCNL1:−2383, upst:CCNL1:−2767, upst:CDK5R2:−6418, upst:CDK4:−7794,upst:CDKN1A:−5830, int:CDKN2C:159, upst:CCNYL2:−36, upst:CCNC:−6760,upst:CDKN2B:−2817, upst:CNNM3:−970, upst:CDK5R2:−6045,upst:CDKN1C:−2196, int:CCND1:874, int:CCND2:1205, upst:CDKN1C:−446,int:CCNG2:390, upst:CDK3:−4148, upst:CCNA2:−250, int:CDKL5:64,upst:CCND2: 3165, int:CCNK:210, int:CDKN1A:885, upst:CDK5R2:−9197,int:CNNM3:1459, upst:CCND1:−1659, int:CCNL2:463, upst:CCNE1:−1190,upst:CDK5R2:−8037, upst:CDKL3:−867, int:CCNG1:381, upst:CCND2:−2874,upst:CDKN2B:−130736, int:CCNI:1042, upst:CCND2:−4757, int:CDK9:352,int:CCND2:1689, int:CDKL5:1682, upst:CDK5R2:−4541, upst:CDK5:−7855,upst:CDK9:−1536, upst:CCND2:−1291, upst:CCND1:−377, int:CCNL1:1097,upst:CDK5R2:−648, upst:CCNL2:−7336, upst:CCND1:−2768, upst:CDK2:−1390,upst:CCNYL3:−8181, dst:CDKN2A:8650, upst:CDK8:−265, upst:CDK4:−4462,upst:CDKN2A:−44, int:CDKN2A:5270, upst:CCNJL:−2749, upst:CNNM4:−1843,upst:CDK5R2:−7376, int:CCNO:1417, upst:CDKN1C:−5, upst:CDKN1C:−6280,upst:ARF:−840, upst:CCND2:−1830, upst:CDK5R1:−206, upst:CCNA1:−1163,int:CCNE2:647, upst:CDK9:−909, upst:CCNYL3:−293, upst:CDKN3:−271,int:CCNT2:640, upst:CCND1:−2574, upst:CCNT2:−319, 
upst:CDK5R1:−3023,upst:CDK9:−3159, upst:CDK9:−8667, upst:CCNE2:−4956, int:CCND3:2384,upst:CDKN1B:−1362, upst:CCNI:−7899, upst:CCNT2:−6751, int:CDK5:1993,upst:CDK9:−8509, upst:CCND1:−7190, upst:CDKN1C:−7144, upst:CDKN3:−4479,upst:CCNB3:−3258, upst:CCND3:−9303, upst:CDK8:−8337, int:CDKN2C:643,upst:CCNYL3:−1019, upst:CDK5:−2373, int:CNNM4:1658, upst:CCNE2:−8552,upst:CCNG1:−9141, upst:CCND2:−4886, upst:CCNK:−8357, upst:CDK5:−9105,upst:CDKN2B:−108997, int:CCNB2:547, upst:CDKN3:−2291, dst:CDKN2A:30203,upst:CDK2:−5210, upst:CCNL1:−3430, upst:CCNF:−3964, upst:CCNK:−4426,upst:CCNF:−3743, upst:CDK5:−3754, upst:CDKN2B:−35359,upst:CDKN2B:−87467, upst:CDK5R2:−4915, upst:CCNF:−2075, upst:CDK6:−8726,upst:CDKN2B:−90566, int:CDKN2A:4904, int:CDKN2A:4432, upst:ARF:−2148,upst:CDKN2B:−130339, upst:CNNM3:−9238, upst:CCNG1:−4532, int:ARF:15754,upst:CCNF:−1085, upst:CDKN2B:−23831, upst:CDKN1A:−9569, int:CCNI:1874,dst:CDKN2A:45866, int:CCNC:816, upst:CCNC:−5405, upst:CDK4:−1632,upst:CCNK:−3241, upst:CDK10:−1805, upst:CCNJL:−671, upst:CDKN3:−5723,upst:CDKN2B:−15114, upst:CCNE2:−5939, upst:CCNJL:−7299,upst:CCND3:−4248, int:CDK9:1811, upst:CDKN2C:−8538, upst:ARF:−1395,upst:CCND2:−6904, upst:CDK4:−977, upst:CCNE1:−5422, upst:CCNE2:−2828,upst:CDK4:−2133, upst:CDK8:−9630, upst:CDK3:−4497, upst:CCND3:−6423,upst:CCND1:−8918, upst:CDKN2B:−119804, upst:CDKN3:−5438,upst:CDKN2C:−7397, upst:CCNYL1:−3709, upst:CDKL4:−6205,upst:CDKN1A:−2237, upst:CDKN1C:−4093, upst:CCND2:−9042, int:CDK8:566,upst:CDKN2B:−804, upst:CCNE1:−4445, upst:CDKN2B:−74328,upst:CDKN2B:−53107, upst:CCNE1:−9426, upst:CDKN2C:−3161,upst:CCNG2:−2953, upst:CNNM1:−2645, upst:CDKN1A:−1902, upst:CDKN3:−1974,upst:CDK10:−4173, upst:CDK9:−9782, upst:CDKN1C:−5693, upst:CDK5:−9871,upst:CNNM4:−4755, upst:CDKN2B:−31120, upst:CDK2:−8040,upst:CDKN2B:−75214, upst:CDKN2C:−127, upst:CDKN1C:−1017, andupst:CNNM4:−3840.

In one embodiment, the invention includes a biomarker panel comprising a plurality of lncRNAs selected from the group consisting of int:CDK6:1276, upst:CDK6:−533, upst:CDKN2C:−8037, int:CDKN2D:1417, upst:CCNL2:−5540, int:CDKN1A:1420, int:CCNT1:602, upst:CCNL2:−3110, upst:CDK5R1:−5717, upst:CCNL2:−982, upst:CCNE2:−682, int:CDK5R1:183, upst:CDK5R1:−482, upst:CDK8:−798, upst:CDK9:−646, upst:CDK6:−1860, int:CDKN2B:1926, int:CDK6:143, dst:CDKN2A:43877, upst:CCNF:−1721, upst:CCNI:−6398, upst:CCNI:−6621, upst:CCNI:−6883, upst:CDKN1A:−4845, upst:CDK5R1:−4044, upst:CDK5R1:−4410, upst:CCNL2:−1391, upst:CCNL2:−2253, upst:CCNL2:−767, dst:CDKN2A:39498, upst:CCNL1:−1968, upst:CCNL1:−2234, upst:CCNL1:−2383, upst:CCNL1:−2767, upst:CDK5R2:−6418, upst:CDK4:−7794, upst:CDKN1A:−5830, int:CDKN2C:159, upst:CCNYL2:−36, upst:CCNC:−6760, upst:CDKN2B:−2817, upst:CNNM3:−970, upst:CDK5R2:−6045, and upst:CDKN1C:−2196.

In another embodiment, the biomarker panel comprises upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017.

In a further embodiment, the biomarker panel comprises upst:CCNL1:−2767, upst:CDKN1A:−4845, upst:CDK9:−9782, int:ARF:+4530, upst:CDKN1C:−1017, upst:CCNG2:−2953, and int:CCNG1:+381.

In another aspect, the invention includes a method for treating cancer comprising administering to a subject in need thereof a therapeutically effective amount of at least one chemotherapeutic agent in combination with a therapeutically effective amount of at least one PANDA inhibitor. Exemplary PANDA inhibitors include antisense oligonucleotides, inhibitory RNA molecules, such as miRNAs, siRNAs, piRNAs, and snRNAs, and ribozymes. In one embodiment, the inhibitory RNA molecule is an siRNA comprising a nucleotide sequence selected from the group consisting of SEQ ID NOS:12-14.

In another embodiment, the invention includes a method for inhibiting PANDA in a subject comprising administering an effective amount of a PANDA inhibitor to the subject.

In another embodiment, the invention includes a method of increasing the activity of the transcription factor NF-YA in a cell, the method comprising introducing an effective amount of a PANDA inhibitor into the cell.

In yet another aspect, the invention provides kits for use in diagnosing cancer or monitoring cancer progression, stem cell therapy or regenerative medical treatments in a subject. The kit may include at least one agent that specifically detects an lncRNA biomarker, a container for holding a biological sample isolated from the subject, and printed instructions for reacting the agent with the biological sample or a portion of the biological sample to detect the presence or amount of at least one lncRNA biomarker in the biological sample. The agents may be packaged in separate containers. The kit may further comprise one or more control reference samples and reagents for performing an immunoassay, microarray analysis, a Northern blot, PCR, or SAGE for detection of biomarkers as described herein.

In yet another aspect, the invention provides kits comprising compositions containing PANDA, or at least one PANDA inhibitor, and/or at least one chemotherapeutic agent, or any combination thereof. The kit may also include one or more transfection reagents to facilitate delivery of oligonucleotides or polynucleotides to cells. The kit may further contain means for administering a PANDA inhibitor to a subject.

These and other embodiments of the subject invention will readily occur to those of skill in the art in view of the disclosure herein.

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1A-1E show the identification of ncRNAs near and within cell-cycle genes.

FIG. 1A shows a flow chart of the strategy for systematic discovery of cell-cycle ncRNAs. FIG. 1B shows representative tiling array data. The RNA hybridization intensity and H3K36me3 and H3K4me3 ChIP-chip signals are shown relative to the input at the CCNE1 locus in human fetal lung fibroblasts. The predicted transcripts are shown in gray boxes. Known mRNA exons are shown in black boxes. Each bar represents a significant peak from one of the 108 array channels. FIG. 1C shows the chromatin state at the transcribed regions. The average ChIP-chip signal is shown relative to the input calculated across transcriptional peaks expressed in human fetal lung fibroblasts with or without doxorubicin treatment. FIG. 1D shows a codon substitution frequency (CSF) analysis with a graph of the average evolutionary CSF of the exons of coding genes and their predicted transcripts. A CSF <10 indicates no protein coding potential. FIG. 1E shows the transcriptional landscape of cell-cycle promoters. We aligned all of the cell-cycle promoters at the TSS and calculated the average RNA hybridization signal across a 12 kb window. The output represents a 150 bp running window of the average transcription signals across all 54 arrays.
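By way of non-limiting illustration, the kind of smoothing described for FIG. 1E (averaging signals across arrays and then applying a running window) can be sketched as follows. The sketch assumes one signal value per base pair for TSS-aligned tracks of equal length and uses only full windows; the function names and these conventions are assumptions for the example, as the actual probe resolution and edge handling are not stated here.

```python
from typing import List, Sequence


def mean_across_arrays(signals: Sequence[Sequence[float]]) -> List[float]:
    """Position-wise mean of equally long signal tracks (one track per array)."""
    if not signals:
        raise ValueError("at least one signal track is required")
    length = len(signals[0])
    if any(len(track) != length for track in signals):
        raise ValueError("all signal tracks must have the same length")
    return [sum(column) / len(signals) for column in zip(*signals)]


def running_mean(signal: Sequence[float], window: int) -> List[float]:
    """Mean of every full window of `window` consecutive positions."""
    if window <= 0 or window > len(signal):
        raise ValueError("window must be between 1 and len(signal)")
    total = sum(signal[:window])
    out = [total / window]
    for i in range(window, len(signal)):
        # Slide the window one position: add the new value, drop the oldest.
        total += signal[i] - signal[i - window]
        out.append(total / window)
    return out
```

Under these assumptions, a 12,000-position track smoothed with a 150-position window yields 11,851 values, one per full window.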

FIGS. 2A and 2B show an analysis of ncRNA expression across diverse cell cycle perturbations. FIG. 2A shows a hierarchical clustering of 216 predicted ncRNAs across 54 arrays, representing 108 conditions. Light gray indicates that the cell cycle perturbation induced transcription of the ncRNA. Dark gray indicates that the cell cycle perturbation repressed transcription of the ncRNA. Black indicates no significant expression change. FIG. 2B shows a close up view of the ncRNAs in cluster 1.

FIGS. 3A-3C show functional associations of the ncRNAs. FIG. 3A shows that lncRNA expression patterns do not correlate with those of the mRNAs in cis. A histogram of Pearson correlations between each of the 216 ncRNAs and the cis mRNA across 108 samples is shown. FIG. 3B shows that lncRNA expression patterns have a positive correlation with neighboring lncRNA transcripts. A histogram of Pearson correlations between each of the 216 ncRNAs and nearby transcripts on the same locus across 108 samples is shown. FIG. 3C shows that the genes co-expressed with lncRNAs are enriched for functional groups in the cell cycle and in the DNA damage response. A module map of lncRNA gene sets (columns) versus Gene Ontology Biological Processes gene sets (rows) across 17 samples (P<0.05, false discovery rate <0.05) is shown. A light gray entry indicates that the Gene Ontology gene set is positively associated with the lncRNA gene set. A dark gray entry indicates that the Gene Ontology gene set is negatively associated with the lncRNA gene set. A black entry indicates no significant association. Representative enriched Gene Ontology gene sets are listed.
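By way of non-limiting illustration, the Pearson correlation underlying the histograms of FIGS. 3A and 3B can be computed for one ncRNA and one neighboring transcript as follows. The function name and the input layout (two equally long lists of expression values, one value per sample) are assumptions made for this sketch.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)
```

Repeating this calculation for each ncRNA and its cis mRNA (or neighboring transcript) gives the set of coefficients that such a histogram would display.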

FIGS. 4A-4D show validated expression of ncRNAs in cell cycle progression, ESC differentiation and human cancers. We generated custom TaqMan probes and used them to interrogate independent biological samples for lncRNA expression. FIGS. 4A and 4B show periodic expression of lncRNAs (dark gray) during synchronized cell cycle progression in HeLa cells (FIG. 4A) and foreskin fibroblasts (FIG. 4B). Cell cycle phases were confirmed by fluorescence-activated cell sorting and expression of genes with known periodic expression in the cell cycle (light gray). FIG. 4C shows a comparison of regulated expression of lncRNAs in human ESCs and fetal pancreas (d, day). FIG. 4D shows a comparison of differential expression of lncRNAs in normal breast epithelium and breast cancer samples.

FIGS. 5A-5F show that ncRNAs at the CDKN1A locus are induced by DNA damage. FIG. 5A shows, at the top, a map of all detected transcripts at the CDKN1A promoter. In the middle, two tracks show examples of RNA hybridization intensity in the control or in human fetal lung fibroblasts treated with doxorubicin (dox) (200 ng/ml) for 24 hours. Note that we did not observe all DNA-damage-inducible transcripts in one single time point. At the bottom, the p53 ChIP-chip signal relative to input confirmed the p53 binding site immediately upstream of the CDKN1A TSS after DNA damage. The RACE clone of upst:CDKN1A:−4845 closely matches the predicted transcript on the tiling array. FIG. 5B shows quantitative RT-PCR of lncRNAs with coordinate induction or repression across a 24 hour time course of doxorubicin treatment. A cluster of lncRNAs transcribed from the CDKN1A locus is induced. FIG. 5C shows the expression of transcripts from the CDKN1A locus over a 24 hour time course after doxorubicin treatment of normal human fibroblasts (FL3). FIG. 5D shows an RNA blot of PANDA confirming the transcript size of 1.5 kb. FIG. 5E shows that doxorubicin induction of PANDA requires p53 but not CDKN1A. The mean±s.d. are shown (*P<0.05 relative to siCTRL (control siRNA) determined by Student's t-test). FIG. 5F shows that expression of wild-type p53 in p53-null H1299 cells restores DNA damage induction of CDKN1A and PANDA. The p53 (p.Val272Cys) loss-of-function mutant fails to restore induction, whereas a gain-of-function Li-Fraumeni allele, p53 (p.Arg273His), selectively retains the ability to induce PANDA.

FIGS. 6A-6G show that the PANDA lncRNA regulates the apoptotic response to DNA damage. FIG. 6A shows the results of siRNA knockdown of PANDA in the presence of DNA damage with doxorubicin in human fibroblasts (FL3). Custom siRNAs specifically target PANDA with no discernible effect on the LAP3 mRNA. The mean±s.d. is shown in all bar graphs (*P<0.05 compared to siCTRL for all panels determined by Student's t-test). FIG. 6B shows a heat map of gene expression changes with siPANDA relative to control siRNA after 24 hours of doxorubicin treatment in FL3 cells. FIG. 6C shows that quantitative RT-PCR of canonical apoptosis pathway genes revealed induction with siPANDA relative to control siRNA after 28 hours of doxorubicin treatment (in FL3 cells). FIG. 6D shows that quantitative RT-PCR of CDKN1A and TP53 in FL3 cells revealed no reduction in expression with siPANDA relative to control siRNA. FIG. 6E shows TUNEL immunofluorescence of control and siPANDA FL3 fibroblasts after 28 hours of doxorubicin treatment (scale bar, 20 μm). FIG. 6F shows quantification of three independent TUNEL assays (P<0.05 for each siPANDA sample compared to siCTRL determined by Student's t-test). FIG. 6G shows a protein blot of PARP cleavage in control and PANDA siRNA FL3 fibroblasts after 24 hours of doxorubicin treatment.

FIGS. 7A-7E show that PANDA regulates transcription factor NF-YA. FIG. 7A shows RNA chromatography of PANDA from doxorubicin-treated FL3 cell lysates. We visualized the retrieved proteins by immunoblot analysis. FIG. 7B shows that immunoprecipitation of NF-YA from doxorubicin-treated FL3 lysates specifically retrieves PANDA, as measured by qRT-PCR. The immunoblot confirms immunoprecipitation of NF-YA, as shown at the bottom. FIG. 7C shows ChIP of NF-YA in FL3 fibroblasts nucleofected with siCTRL or siPANDA. ChIP-qPCR is shown for known NF-YA target sites on promoters of CCNB1, FAS, NOXA, BBC3 (PUMA) or a control downstream region in the FAS promoter lacking the NF-YA motif. Mean+s.d. is shown in all bar graphs (*P<0.05 determined by Student's t-test). FIG. 7D shows that concomitant knockdown of NF-YA attenuates induction of apoptotic genes by PANDA depletion, as measured by qRT-PCR. FIG. 7E shows that concomitant knockdown of NF-YA rescues apoptosis induced by PANDA depletion. Quantification of TUNEL staining is shown. The legend for this panel is as in FIG. 7D.

FIG. 8 shows a model of coding and noncoding transcripts at the CDKN1A locus coordinating the DNA damage response. After DNA damage, p53 binding at the CDKN1A locus coordinately activates transcription of CDKN1A as well as noncoding transcripts PANDA and linc-p21. CDKN1A mediates cell cycle arrest; PANDA blocks apoptosis through NF-YA; and linc-p21 mediates gene silencing through recruitment of hnRNP-K.

FIG. 9 shows a heatmap of lncRNAs expressed in each of the 104 different RNA tiling arrays as determined by peak calling analysis.

FIG. 10 shows that RT-PCR validated the expression correlation between 60 lncRNAs and their nearest 3′ and 5′ mRNAs across 34 RNA samples.

FIG. 11 shows gene sets of mRNAs positively or negatively correlated with each lncRNA as determined by pairwise Pearson correlation across 17 tiling and expression arrays.

FIG. 12 shows a Molecular Signatures Database (MSigDB) module map of gene sets associated with lncRNAs.

FIG. 13 shows that PANDA is evolutionarily conserved across vertebrates as determined by the 44-way vertebrate conservation PhastCons score.

FIGS. 14A and 14B show 24 hour DNA damage time courses of PANDA (FIG. 14A) and LAP3 (FIG. 14B) expression. Human fetal lung fibroblast (FL3) cells were treated with doxorubicin and collected at the indicated time points for RT-PCR analysis.

FIG. 15 shows p53-dependent DNA damage induction in a subset of lncRNAs. A heatmap is shown of lncRNA expression (as measured by RT-PCR) of human fetal lung fibroblasts (FL3) treated with doxorubicin in the presence of siCTRL, siCDKN1A, or siTP53. Light gray indicates induction relative to undamaged cells. Dark gray indicates repression.

FIGS. 16A and 16B show PANDA expression levels in tumors. FIG. 16A shows a comparison of the expression in p53 mutant and p53 wild-type tumors. Human primary breast tumors were derived from the fresh-frozen tissue bank of the Netherlands Cancer Institute/Antoni van Leeuwenhoek Hospital. TP53 mutations were identified by DNA sequencing of exons 2-11. FIG. 16B shows a comparison of the expression of PANDA in 5 normal breast tissues and 5 metastatic ductal carcinomas, obtained from the same tissue depository as the samples of FIG. 16A.

FIG. 17 shows that three independent siRNAs to upst:CDKN1A:−800 did not induce PARP cleavage in FL3 cells upon treatment with doxorubicin.

FIG. 18 shows the knockdown efficiency of NFYA and PANDA for FIGS. 7D and 7E.

DETAILED DESCRIPTION

The practice of the present invention will employ, unless otherwise indicated, conventional methods of pharmacology, chemistry, biochemistry, recombinant DNA techniques and immunology, within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Handbook of Experimental Immunology, Vols. I-IV (D. M. Weir and C. C. Blackwell eds., Blackwell Scientific Publications); A. L. Lehninger, Biochemistry (Worth Publishers, Inc., current edition); Sambrook, et al., Molecular Cloning: A Laboratory Manual (3rd Edition, 2001); Methods In Enzymology (S. Colowick and N. Kaplan eds., Academic Press, Inc.).

All publications, patents and patent applications cited herein, whether supra or infra, are hereby incorporated by reference in their entireties.

I. DEFINITIONS

In describing the present invention, the following terms will be employed, and are intended to be defined as indicated below.

It must be noted that, as used in this specification and the appended claims, the singular forms “a,” “an” and “the” include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to “an lncRNA” includes a mixture of two or more lncRNAs, and the like.

The term “about,” particularly in reference to a given quantity, is meant to encompass deviations of plus or minus five percent.

“PANDA” refers to upst:CDKN1A:−4845, also known as P21-Associated, Non-coding RNA, DNA damage Activated, a long non-coding RNA transcript produced from chromosome 6 at nucleotide positions 36749619-36750963. A representative human sequence of PANDA is shown in SEQ ID NO:1.

The terms “microRNA,” “miRNA,” and “MiR” are interchangeable and refer to endogenous or artificial non-coding RNAs that are capable of regulating gene expression. It is believed that miRNAs function via RNA interference. When used herein in the context of inactivation, the use of the term microRNAs is intended to include also long non-coding RNAs, piRNAs, siRNAs, and the like. Endogenous (e.g., naturally occurring) miRNAs are typically expressed from RNA polymerase II promoters and are generated from a larger transcript.

The terms “siRNA” and “short interfering RNA” are interchangeable and refer to single-stranded or double-stranded RNA molecules that are capable of inducing RNA interference. SiRNA molecules typically have a duplex region that is between 18 and 30 base pairs in length.

The terms “piRNA” and “Piwi-interacting RNA” are interchangeable and refer to a class of small RNAs involved in gene silencing. PiRNA molecules typically are between 26 and 31 nucleotides in length.

The terms “snRNA” and “small nuclear RNA” are interchangeable and refer to a class of small RNAs involved in a variety of processes including RNA splicing and regulation of transcription factors. The subclass of small nucleolar RNAs (snoRNAs) is also included. The term is also intended to include artificial snRNAs, such as antisense derivatives of snRNAs comprising antisense sequences directed against the lncRNA, PANDA.

The terms “polynucleotide,” “oligonucleotide,” “nucleic acid” and “nucleic acid molecule” are used herein to include a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. This term refers only to the primary structure of the molecule. Thus, the term includes triple-, double- and single-stranded DNA, as well as triple-, double- and single-stranded RNA. It also includes modifications, such as by methylation and/or by capping, and unmodified forms of the polynucleotide. More particularly, the terms “polynucleotide,” “oligonucleotide,” “nucleic acid” and “nucleic acid molecule” include polydeoxyribonucleotides (containing 2-deoxy-D-ribose), polyribonucleotides (containing D-ribose), any other type of polynucleotide which is an N- or C-glycoside of a purine or pyrimidine base, and other polymers containing nonnucleotidic backbones, for example, polyamide (e.g., peptide nucleic acids (PNAs)) and polymorpholino (commercially available from the Anti-Virals, Inc., Corvallis, Oreg., as Neugene) polymers, and other synthetic sequence-specific nucleic acid polymers providing that the polymers contain nucleobases in a configuration which allows for base pairing and base stacking, such as is found in DNA and RNA. There is no intended distinction in length between the terms “polynucleotide,” “oligonucleotide,” “nucleic acid” and “nucleic acid molecule,” and these terms will be used interchangeably. Thus, these terms include, for example, 3′-deoxy-2′,5′-DNA, oligodeoxyribonucleotide N3′ P5′ phosphoramidates, 2′-O-alkyl-substituted RNA, double- and single-stranded DNA, as well as double- and single-stranded RNA, microRNA, DNA:RNA hybrids, and hybrids between PNAs and DNA or RNA, and also include known types of modifications, for example, labels which are known in the art, methylation, “caps,” substitution of one or more of the naturally occurring nucleotides with an analog (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, C5-propynylcytidine, C5-propynyluridine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-methylcytidine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and 2-thiocytidine), internucleotide modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, etc.), with negatively charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), and with positively charged linkages (e.g., aminoalkylphosphoramidates, aminoalkylphosphotriesters), those containing pendant moieties, such as, for example, proteins (including nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), those with intercalators (e.g., acridine, psoralen, etc.), those containing chelators (e.g., metals, radioactive metals, boron, oxidative metals, etc.), those containing alkylators, those with modified linkages (e.g., alpha anomeric nucleic acids, etc.), as well as unmodified forms of the polynucleotide or oligonucleotide. The term also includes locked nucleic acids (e.g., comprising a ribonucleotide that has a methylene bridge between the 2′-oxygen atom and the 4′-carbon atom). See, for example, Kurreck et al. (2002) Nucleic Acids Res. 30: 1911-1918; Elayadi et al. (2001) Curr. Opinion Invest. Drugs 2: 558-561; Orum et al. (2001) Curr. Opinion Mol. Ther. 3: 239-243; Koshkin et al. (1998) Tetrahedron 54: 3607-3630; Obika et al. (1998) Tetrahedron Lett. 39: 5401-5404.

The term “homologous region” refers to a region of a nucleic acid with homology to another nucleic acid region. Thus, whether a “homologous region” is present in a nucleic acid molecule is determined with reference to another nucleic acid region in the same or a different molecule. Further, since a nucleic acid is often double-stranded, the term “homologous region,” as used herein, refers to the ability of nucleic acid molecules to hybridize to each other. For example, a single-stranded nucleic acid molecule can have two homologous regions which are capable of hybridizing to each other. Thus, the term “homologous region” includes nucleic acid segments with complementary sequence. Homologous regions may vary in length, but will typically be between 4 and 40 nucleotides (e.g., from about 4 to about 40, from about 5 to about 40, from about 5 to about 35, from about 5 to about 30, from about 5 to about 20, from about 6 to about 30, from about 6 to about 25, from about 6 to about 15, from about 7 to about 18, from about 8 to about 20, from about 8 to about 15, etc.).

The terms “complementary” and “complementarity” are interchangeable and refer to the ability of polynucleotides to form base pairs with one another. Base pairs are typically formed by hydrogen bonds between nucleotide units in antiparallel polynucleotide strands or regions. Complementary polynucleotide strands or regions can base pair in the Watson-Crick manner (e.g., A to T, A to U, C to G). 100% complementary refers to the situation in which each nucleotide unit of one polynucleotide strand or region can hydrogen bond with each nucleotide unit of a second polynucleotide strand or region. Less than perfect complementarity refers to the situation in which some, but not all, nucleotide units of two strands or two regions can hydrogen bond with each other and can be expressed as a percentage.
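By way of illustration only, percent complementarity as described above can be sketched in Python as follows. The function name `percent_complementarity` and the restriction to gap-free strands of equal length are assumptions made for this sketch and are not part of the disclosure.

```python
# Illustrative sketch only. Watson-Crick pairs as listed above (A-T, A-U, C-G).
WATSON_CRICK = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("C", "G"), ("G", "C"),
}


def percent_complementarity(strand_a: str, strand_b: str) -> float:
    """Return the percentage of positions that can base pair.

    Both strands are written 5'->3' and are paired antiparallel.
    Assumes equal-length strands without gaps (a simplification
    chosen for this example).
    """
    a = strand_a.upper()
    b = strand_b.upper()
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    # Reverse the second strand so that position i of `a` faces position i of `b`.
    paired = sum((x, y) in WATSON_CRICK for x, y in zip(a, reversed(b)))
    return 100.0 * paired / len(a)
```

For example, under these assumptions `percent_complementarity("AUGC", "GCAU")` corresponds to 100% complementarity, while a single mismatched position in a four-nucleotide region lowers the value to 75%.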

A “target site” or “target sequence” is the nucleic acid sequence recognized (i.e., sufficiently complementary for hybridization) by an antisense oligonucleotide or inhibitory RNA molecule.

The terms “hairpin” and “stem-loop” can be used interchangeably and refer to stem-loop structures. The stem results from two sequences of nucleic acid or modified nucleic acid annealing together to generate a duplex. The loop lies between the two strands comprising the stem.

The term “loop” refers to the part of the stem-loop between the two homologous regions (the stem) that can loop around to allow base-pairing of the two homologous regions. The loop can be composed of nucleic acid (e.g., DNA or RNA) or non-nucleic acid material(s), referred to herein as nucleotide or non-nucleotide loops. A non-nucleotide loop can also be situated at the end of a nucleotide molecule with or without a stem structure.

“Administering” a nucleic acid, such as a microRNA, siRNA, piRNA, snRNA, antisense nucleic acid, or lncRNA to a cell comprises transducing, transfecting, electroporating, translocating, fusing, phagocytosing, shooting or ballistic methods, etc., i.e., any means by which a nucleic acid can be transported across a cell membrane.

The term “transfection” is used to refer to the uptake of foreign DNA or RNA by a cell. A cell has been “transfected” when exogenous DNA or RNA has been introduced inside the cell membrane. A number of transfection techniques are generally known in the art. See, e.g., Graham et al. (1973) Virology, 52:456, Sambrook et al. (2001) Molecular Cloning, a laboratory manual, 3rd edition, Cold Spring Harbor Laboratories, New York, Davis et al. (1995) Basic Methods in Molecular Biology, 2nd edition, McGraw-Hill, and Chu et al. (1981) Gene 13:197. Such techniques can be used to introduce one or more exogenous DNA or RNA moieties into suitable host cells. The term refers to both stable and transient uptake of the genetic material, and includes uptake, for example, of microRNA, siRNA, piRNA, lncRNA, or antisense nucleic acids.

“Pharmaceutically acceptable excipient or carrier” refers to an excipient that may optionally be included in the compositions of the invention and that causes no significant adverse toxicological effects to the patient.

“Pharmaceutically acceptable salt” includes, but is not limited to, amino acid salts, salts prepared with inorganic acids, such as chloride, sulfate, phosphate, diphosphate, bromide, and nitrate salts, or salts prepared from the corresponding inorganic acid form of any of the preceding, e.g., hydrochloride, etc., or salts prepared with an organic acid, such as malate, maleate, fumarate, tartrate, succinate, ethylsuccinate, citrate, acetate, lactate, methanesulfonate, benzoate, ascorbate, para-toluenesulfonate, pamoate, salicylate and stearate, as well as estolate, gluceptate and lactobionate salts. Similarly, salts containing pharmaceutically acceptable cations include, but are not limited to, sodium, potassium, calcium, aluminum, lithium, and ammonium (including substituted ammonium).

The terms “tumor,” “cancer” and “neoplasia” are used interchangeably and refer to a cell or population of cells whose growth, proliferation or survival is greater than growth, proliferation or survival of a normal counterpart cell, e.g., a cell proliferative, hyperproliferative or differentiative disorder. Typically, the growth is uncontrolled. The term “malignancy” refers to invasion of nearby tissue. The term “metastasis” or a secondary, recurring or recurrent tumor, cancer or neoplasia refers to spread or dissemination of a tumor, cancer or neoplasia to other sites, locations or regions within the subject, in which the sites, locations or regions are distinct from the primary tumor or cancer. Neoplasia, tumors and cancers include benign, malignant, metastatic and non-metastatic types, and include any stage (I, II, III, IV or V) or grade (G1, G2, G3, etc.) of neoplasia, tumor, or cancer, or a neoplasia, tumor, cancer or metastasis that is progressing, worsening, stabilized or in remission. In particular, the terms “tumor,” “cancer” and “neoplasia” include carcinomas, such as squamous cell carcinoma, adenocarcinoma, adenosquamous carcinoma, anaplastic carcinoma, large cell carcinoma, and small cell carcinoma. These terms include, but are not limited to, breast cancer, prostate cancer, lung cancer, ovarian cancer, testicular cancer, colon cancer, pancreatic cancer, gastric cancer, hepatic cancer, leukemia, lymphoma, adrenal cancer, thyroid cancer, pituitary cancer, renal cancer, brain cancer, skin cancer, head cancer, neck cancer, oral cavity cancer, tongue cancer, and throat cancer.

An “effective amount” of a PANDA inhibitor (e.g., microRNA, siRNA, piRNA, snRNA, antisense nucleic acid, ribozyme, or small molecule inhibitor) is an amount sufficient to effect beneficial or desired results, such as an amount that inhibits the activity of the lncRNA, PANDA, for example by interfering with transcription of PANDA or interfering with binding of PANDA to the transcription factor NF-YA. An effective amount can be administered in one or more administrations, applications, or dosages.

By “anti-tumor activity” is intended a reduction in the rate of cell proliferation, and hence a decline in growth rate of an existing tumor or in a tumor that arises during therapy, and/or destruction of existing neoplastic (tumor) cells or newly formed neoplastic cells, and hence a decrease in the overall size of a tumor during therapy. Such activity can be assessed using animal models.

By “therapeutically effective dose or amount” of a PANDA inhibitor is intended an amount that, when administered as described herein, brings about a positive therapeutic response, such as anti-tumor activity. The exact amount required will vary from subject to subject, depending on the species, age, and general condition of the subject, the severity of the condition being treated, the particular drug or drugs employed, mode of administration, and the like. An appropriate “effective” amount in any individual case may be determined by one of ordinary skill in the art using routine experimentation, based upon the information provided herein.

The term “tumor response” as used herein means a reduction or elimination of all measurable lesions. The criteria for tumor response are based on the WHO Reporting Criteria [WHO Offset Publication, 48-World Health Organization, Geneva, Switzerland, (1979)]. Ideally, all uni- or bidimensionally measurable lesions should be measured at each assessment. When multiple lesions are present in any organ, such measurements may not be possible and, under such circumstances, up to 6 representative lesions should be selected, if available.

The term “complete response” (CR) as used herein means a complete disappearance of all clinically detectable malignant disease, determined by 2 assessments at least 4 weeks apart.

The term “partial response” (PR) as used herein means a 50% or greater reduction from baseline in the sum of the products of the longest perpendicular diameters of all measurable disease without progression of evaluable disease and without evidence of any new lesions as determined by at least two consecutive assessments at least four weeks apart. Assessments should show a partial decrease in the size of lytic lesions, recalcifications of lytic lesions, or decreased density of blastic lesions.
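The size criterion of the partial response definition above can be illustrated with a short Python sketch. The names `Lesion`, `sum_of_products`, `percent_reduction` and `meets_size_criterion` are hypothetical and introduced only for this example. The sketch covers only the 50% reduction test at a single assessment; the absence of progression or new lesions and the confirmation by consecutive assessments at least four weeks apart are not modeled.

```python
from typing import Sequence, Tuple

# Hypothetical representation of one measurable lesion:
# (longest diameter, longest perpendicular diameter), in the same unit.
Lesion = Tuple[float, float]


def sum_of_products(lesions: Sequence[Lesion]) -> float:
    """Sum over lesions of the product of the two perpendicular diameters."""
    return sum(d1 * d2 for d1, d2 in lesions)


def percent_reduction(baseline: Sequence[Lesion], followup: Sequence[Lesion]) -> float:
    """Percent reduction from baseline in the sum of products."""
    base = sum_of_products(baseline)
    if base <= 0:
        raise ValueError("baseline must contain measurable disease")
    return 100.0 * (base - sum_of_products(followup)) / base


def meets_size_criterion(baseline: Sequence[Lesion], followup: Sequence[Lesion]) -> bool:
    """True if the reduction from baseline is 50% or greater."""
    return percent_reduction(baseline, followup) >= 50.0
```

For instance, two baseline lesions measuring 4.0 x 3.0 and 2.0 x 2.0 give a sum of products of 16; a follow-up sum of 7 is a 56.25% reduction and would satisfy the size criterion in this sketch.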

“Substantially purified” generally refers to isolation of a substance (compound, polynucleotide, protein, polypeptide, polypeptide composition) such that the substance comprises the majority percent of the sample in which it resides. Typically in a sample, a substantially purified component comprises 50%, preferably 80%-85%, more preferably 90-95% of the sample. Techniques for purifying polynucleotides and polypeptides of interest are well-known in the art and include, for example, ion-exchange chromatography, affinity chromatography and sedimentation according to density.

By “isolated” is meant, when referring to a polypeptide, that the indicated molecule is separate and discrete from the whole organism with which the molecule is found in nature or is present in the substantial absence of other biological macromolecules of the same type. The term “isolated” with respect to a polynucleotide is a nucleic acid molecule devoid, in whole or part, of sequences normally associated with it in nature; or a sequence, as it exists in nature, but having heterologous sequences in association therewith; or a molecule disassociated from the chromosome.

“Homology” refers to the percent identity between two polynucleotide or two polypeptide moieties. Two nucleic acid or two polypeptide sequences are “substantially homologous” to each other when the sequences exhibit at least about 50% sequence identity, preferably at least about 75% sequence identity, more preferably at least about 80%-85% sequence identity, more preferably at least about 90% sequence identity, and most preferably at least about 95%-98% sequence identity over a defined length of the molecules. As used herein, substantially homologous also refers to sequences showing complete identity to the specified sequence.

In general, “identity” refers to an exact nucleotide to nucleotide or amino acid to amino acid correspondence of two polynucleotides or polypeptide sequences, respectively. Percent identity can be determined by a direct comparison of the sequence information between two molecules by aligning the sequences, counting the exact number of matches between the two aligned sequences, dividing by the length of the shorter sequence, and multiplying the result by 100. Readily available computer programs can be used to aid in the analysis, such as ALIGN, Dayhoff, M. O. in Atlas of Protein Sequence and Structure M. O. Dayhoff ed., 5 Suppl. 3:353-358, National Biomedical Research Foundation, Washington, D.C., which adapts the local homology algorithm of Smith and Waterman Advances in Appl. Math. 2:482-489, 1981 for peptide analysis. Programs for determining nucleotide sequence identity are available in the Wisconsin Sequence Analysis Package, Version 8 (available from Genetics Computer Group, Madison, Wis.) for example, the BESTFIT, FASTA and GAP programs, which also rely on the Smith and Waterman algorithm. These programs are readily utilized with the default parameters recommended by the manufacturer and described in the Wisconsin Sequence Analysis Package referred to above. For example, percent identity of a particular nucleotide sequence to a reference sequence can be determined using the homology algorithm of Smith and Waterman with a default scoring table and a gap penalty of six nucleotide positions.
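The direct-comparison calculation described above (count exact matches between the aligned sequences, divide by the length of the shorter sequence, multiply by 100) can be sketched in Python as follows. This is an illustration under stated assumptions, not an implementation of ALIGN, BESTFIT, FASTA or GAP: the two sequences are assumed to have been aligned beforehand, the gap character `-` and the function name `percent_identity` are hypothetical choices for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    The inputs must be the two rows of an existing alignment, so they
    have equal length and may contain the gap character.
    """
    a = aligned_a.upper()
    b = aligned_b.upper()
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Count exact matches, ignoring columns in which either row has a gap.
    matches = sum(x == y and x != gap for x, y in zip(a, b))
    # Length of the shorter sequence, measured without gap characters.
    shorter = min(len(a.replace(gap, "")), len(b.replace(gap, "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter
```

As a usage example, two aligned eight-nucleotide sequences that differ at one position give 7 matches over a shorter length of 8, that is, 87.5% identity.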

Another method of establishing percent identity in the context of the present invention is to use the MPSRCH package of programs copyrighted by the University of Edinburgh, developed by John F. Collins and Shane S. Sturrok, and distributed by IntelliGenetics, Inc. (Mountain View, Calif.). From this suite of packages the Smith Waterman algorithm can be employed where default parameters are used for the scoring table (for example, gap open penalty of 12, gap extension penalty of one, and a gap of six). From the data generated the “Match” value reflects “sequence identity.” Other suitable programs for calculating the percent identity or similarity between sequences are generally known in the art, for example, another alignment program is BLAST, used with default parameters. For example, BLASTN and BLASTP can be used using the following default parameters: genetic code=standard; filter=none; strand=both; cutoff=60; expect=10; Matrix=BLOSUM62; Descriptions=50 sequences; sort by=HIGH SCORE; Databases=non redundant, GenBank+EMBL+DDBJ+PDB+GenBank CDS translations+Swiss protein+Spupdate+PIR. Details of these programs are readily available.

Alternatively, homology can be determined by hybridization of polynucleotides under conditions which form stable duplexes between homologous regions, followed by digestion with single stranded specific nuclease(s), and size determination of the digested fragments. DNA sequences that are substantially homologous can be identified in a Southern hybridization experiment under, for example, stringent conditions, as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art. See, e.g., Sambrook et al., supra; DNA Cloning, supra; Nucleic Acid Hybridization, supra.

“Recombinant” as used herein to describe a nucleic acid molecule means a polynucleotide of genomic, cDNA, viral, semisynthetic, or synthetic origin which, by virtue of its origin or manipulation, is not associated with all or a portion of the polynucleotide with which it is associated in nature. The term “recombinant” as used with respect to a protein or polypeptide means a polypeptide produced by expression of a recombinant polynucleotide. In general, the gene of interest is cloned and then expressed in transformed organisms, as described further below. The host organism expresses the foreign gene to produce the protein under expression conditions.

The term “transformation” refers to the insertion of an exogenous polynucleotide into a host cell, irrespective of the method used for the insertion. For example, direct uptake, transduction or f-mating are included. The exogenous polynucleotide may be maintained as a non-integrated vector, for example, a plasmid, or alternatively, may be integrated into the host genome.

“Recombinant host cells,” “host cells,” “cells,” “cell lines,” “cell cultures,” and other such terms denoting microorganisms or higher eukaryotic cell lines cultured as unicellular entities refer to cells which can be, or have been, used as recipients for recombinant vector or other transferred DNA, and include the original progeny of the original cell which has been transfected.

“Operably linked” refers to an arrangement of elements wherein the components so described are configured so as to perform their usual function. Thus, a given promoter operably linked to a coding sequence is capable of effecting the expression of the coding sequence when the proper enzymes are present. Expression is meant to include transcription of any one or more of a microRNA, siRNA, piRNA, snRNA, lncRNA, antisense nucleic acid, or mRNA from a DNA or RNA template and can further include translation of a protein from an mRNA template. The promoter need not be contiguous with the coding sequence, so long as it functions to direct the expression thereof. Thus, for example, intervening untranslated yet transcribed sequences can be present between the promoter sequence and the coding sequence and the promoter sequence can still be considered “operably linked” to the coding sequence.

“Purified polynucleotide” refers to a polynucleotide of interest or fragment thereof which is essentially free, e.g., contains less than about 50%, preferably less than about 70%, and more preferably less than about at least 90%, of the protein with which the polynucleotide is naturally associated. Techniques for purifying polynucleotides of interest are well-known in the art and include, for example, disruption of the cell containing the polynucleotide with a chaotropic agent and separation of the polynucleotide(s) and proteins by ion-exchange chromatography, affinity chromatography and sedimentation according to density.

A “vector” is capable of transferring nucleic acid sequences to target cells (e.g., viral vectors, non-viral vectors, particulate carriers, and liposomes). Typically, “vector construct,” “expression vector,” and “gene transfer vector,” mean any nucleic acid construct capable of directing the expression of a nucleic acid of interest and which can transfer nucleic acid sequences to target cells. Thus, the term includes cloning and expression vehicles, as well as viral vectors.

The term “variant” refers to biologically active derivatives of the reference molecule that retain desired activity, such as RNA interference (RNAi), lncRNA inhibition, or transcription factor inhibition. In general, the term “variant” refers to molecules (e.g., lncRNAs, miRNAs, siRNAs, piRNAs, snRNAs, antisense nucleic acids, or other inhibitors of lncRNAs) having a native sequence and structure with one or more additions, substitutions (generally conservative in nature) and/or deletions, relative to the native molecule, so long as the modifications do not destroy biological activity and which are “substantially homologous” to the reference molecule. In general, the sequences of such variants will have a high degree of sequence homology to the reference sequence, e.g., sequence homology of more than 50%, generally more than 60%-70%, even more particularly 80%-85% or more, such as at least 90%-95% or more, when the two sequences are aligned.

“Gene transfer” or “gene delivery” refers to methods or systems for reliably inserting DNA or RNA of interest into a host cell. Such methods can result in transient expression of non-integrated transferred DNA, extrachromosomal replication and expression of transferred replicons (e.g., episomes), or integration of transferred genetic material into the genomic DNA of host cells. Gene delivery expression vectors include, but are not limited to, vectors derived from bacterial plasmid vectors, viral vectors, non-viral vectors, alphaviruses, pox viruses and vaccinia viruses.

The term “derived from” is used herein to identify the original source of a molecule but is not meant to limit the method by which the molecule is made, which can be, for example, by chemical synthesis or recombinant means.

A polynucleotide “derived from” a designated sequence refers to a polynucleotide sequence which comprises a contiguous sequence of at least about 6 nucleotides, preferably at least about 8 nucleotides, more preferably at least about 10-12 nucleotides, and even more preferably at least about 15-20 nucleotides corresponding, i.e., identical or complementary to, a region of the designated nucleotide sequence. The derived polynucleotide will not necessarily be derived physically from the nucleotide sequence of interest, but may be generated in any manner, including, but not limited to, chemical synthesis, replication, reverse transcription or transcription, which is based on the information provided by the sequence of bases in the region(s) from which the polynucleotide is derived. As such, it may represent either a sense or an antisense orientation of the original polynucleotide.
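As a purely illustrative sketch of the definition above, the following Python code tests whether a candidate polynucleotide contains a contiguous stretch of a chosen minimum length that is identical or complementary to a region of a designated sequence. The function names `reverse_complement` and `shares_contiguous_region`, the default minimum length of 6, and the treatment of U as equivalent to T are assumptions made for this example; "complementary" is interpreted here as the reverse complement.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _normalize(seq: str) -> str:
    """Upper-case the sequence and treat RNA (U) as DNA (T) for comparison."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA or RNA sequence, returned as DNA letters."""
    return _normalize(seq).translate(_COMPLEMENT)[::-1]


def shares_contiguous_region(candidate: str, designated: str, min_len: int = 6) -> bool:
    """True if `candidate` has a window of `min_len` nucleotides that is
    identical (sense) or complementary (antisense) to part of `designated`."""
    cand = _normalize(candidate)
    targets = (_normalize(designated), reverse_complement(designated))
    for start in range(len(cand) - min_len + 1):
        window = cand[start:start + min_len]
        if any(window in target for target in targets):
            return True
    return False
```

In this sketch a candidate in the sense orientation and a candidate in the antisense orientation are both reported as matching, consistent with the statement that a derived polynucleotide may represent either orientation of the original polynucleotide.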

A “biomarker” in the context of the present invention refers to anlncRNA which is differentially expressed in a biological sample (e.g., abiopsy taken from a subject having cancer or a tissue undergoingregeneration or a stem cell undergoing differentiation) as compared to acontrol sample (e.g., a comparable sample taken from a person with anegative diagnosis, a normal or healthy subject, or normal, untreatedtissue or cells). The biomarker can be an lncRNA that can be detectedand/or quantified. Biomarkers include, but are not limited toint:CDK6:143, dst:CDKN2A:43877, upst:CCNF:−1721, upst:CCNI:−6398,upst:CCNI:−6621, upst:CCNI:−6883, upst:CDKN1A:−4845, upst:CDK5R1:−4044,upst:CDK5R1:−4410, upst:CCNL2:−1391, upst:CCNL2:−2253, upst:CCNL2:−767,int:CDKN2D:1417, upst:CCNL2:−5540, int:CDKN1A:1420, int:CCNT1:602,upst:CCNL2:−3110, upst:CDK5R1:−5717, upst:CCNL2:−982, upst:CCNE2:−682,int:CDK5R1:183, upst:CDK5R1:−482, upst:CDK8:−798, upst:CDK9:−646,upst:CDK6:−1860, int:CDK6:1276, upst:CDK6:−533, upst:CDKN2C:−8037,upst:CCNK:−899, upst:CNNM3:−248, upst:CDKN1C:−4619, int:CDKN2A:6667,int:ARF:4530, upst:CDKN2B:−15913, upst:CDK6:−1679, upst:CDKN1A:−1210,int:CDKN2B:1926, dst:CDKN2A:39498, upst:CCNL1:−1968, upst:CCNL1:−2234,upst:CCNL1:−2383, upst:CCNL1:−2767, upst:CDK5R2:−6418, upst:CDK4:−7794,upst:CDKN1A:−5830, int:CDKN2C:159, upst:CCNYL2:−36, upst:CCNC:−6760,upst:CDKN2B:−2817, upst:CNNM3:−970, upst:CDK5R2:−6045,upst:CDKN1C:−2196, int:CCND1:874, int:CCND2:1205, upst:CDKN1C:−446,int:CCNG2:390, upst:CDK3:−4148, upst:CCNA2:−250, int:CDKL5:64,upst:CCND2: 3165, int:CCNK:210, int:CDKN1A:885, upst:CDK5R2:−9197,int:CNNM3:1459, upst:CCND1:−1659, int:CCNL2:463, upst:CCNE1:−1190,upst:CDK5R2:−8037, upst:CDKL3:−867, int:CCNG1:381, upst:CCND2:−2874,upst:CDKN2B:−130736, int:CCNI:1042, upst:CCND2:−4757, int:CDK9:352,int:CCND2:1689, int:CDKL5:1682, upst:CDK5R2:−4541, upst:CDK5:−7855,upst:CDK9:−1536, upst:CCND2:−1291, upst:CCND1:−377, int:CCNL1:1097,upst:CDK5R2:−648, upst:CCNL2:−7336, upst:CCND1:−2768, 
upst:CDK2:−1390,upst:CCNYL3:−8181, dst:CDKN2A:8650, upst:CDK8:−265, upst:CDK4:−4462,upst:CDKN2A:−44, int:CDKN2A:5270, upst:CCNJL:−2749, upst:CNNM4:−1843,upst:CDK5R2:−7376, int:CCNO:1417, upst:CDKN1C:−5, upst:CDKN1C:−6280,upst:ARF:−840, upst:CCND2:−1830, upst:CDK5R1:−206, upst:CCNA1:−1163,int:CCNE2:647, upst:CDK9:−909, upst:CCNYL3:−293, upst:CDKN3:−271,int:CCNT2:640, upst:CCND1:−2574, upst:CCNT2:−319, upst:CDK5R1:−3023,upst:CDK9:−3159, upst:CDK9:−8667, upst:CCNE2:−4956, int:CCND3:2384,upst:CDKN1B:−1362, upst:CCNI:−7899, upst:CCNT2:−6751, int:CDK5:1993,upst:CDK9:−8509, upst:CCND1:−7190, upst:CDKN1C:−7144, upst:CDKN3:−4479,upst:CCNB3:−3258, upst:CCND3:−9303, upst:CDK8:−8337, int:CDKN2C:643,upst:CCNYL3:−1019, upst:CDK5:−2373, int:CNNM4:1658, upst:CCNE2:−8552,upst:CCNG1:−9141, upst:CCND2:−4886, upst:CCNK:−8357, upst:CDK5:−9105,upst:CDKN2B:−108997, int:CCNB2:547, upst:CDKN3:−2291, dst:CDKN2A:30203,upst:CDK2:−5210, upst:CCNL1:−3430, upst:CCNF:−3964, upst:CCNK:−4426,upst:CCNF:−3743, upst:CDK5:−3754, upst:CDKN2B:−35359,upst:CDKN2B:−87467, upst:CDK5R2:−4915, upst:CCNF:−2075, upst:CDK6:−8726,upst:CDKN2B:−90566, int:CDKN2A:4904, int:CDKN2A:4432, upst:ARF:−2148,upst:CDKN2B:−130339, upst:CNNM3:−9238, upst:CCNG1:−4532, int:ARF:15754,upst:CCNF:−1085, upst:CDKN2B:−23831, upst:CDKN1A:−9569, int:CCNI:1874,dst:CDKN2A:45866, int:CCNC:816, upst:CCNC:−5405, upst:CDK4:−1632,upst:CCNK:−3241, upst:CDK10:−1805, upst:CCNJL:−671, upst:CDKN3:−5723,upst:CDKN2B:−15114, upst:CCNE2:−5939, upst:CCNJL:−7299,upst:CCND3:−4248, int:CDK9:1811, upst:CDKN2C:−8538, upst:ARF:−1395,upst:CCND2:−6904, upst:CDK4:−977, upst:CCNE1:−5422, upst:CCNE2:−2828,upst:CDK4:−2133, upst:CDK8:−9630, upst:CDK3:−4497, upst:CCND3:−6423,upst:CCND1:−8918, upst:CDKN2B:−119804, upst:CDKN3:−5438,upst:CDKN2C:−7397, upst:CCNYL1:−3709, upst:CDKL4:−6205,upst:CDKN1A:−2237, upst:CDKN1C:−4093, upst:CCND2:−9042, int:CDK8:566,upst:CDKN2B:−804, upst:CCNE1:−4445, upst:CDKN2B:−74328,upst:CDKN2B:−53107, upst:CCNE1:−9426, 
upst:CDKN2C:−3161,upst:CCNG2:−2953, upst:CNNM1:−2645, upst:CDKN1A:−1902, upst:CDKN3:−1974,upst:CDK10:−4173, upst:CDK9:−9782, upst:CDKN1C:−5693, upst:CDK5:−9871,upst:CNNM4:−4755, upst:CDKN2B:−31120, upst:CDK2:−8040,upst:CDKN2B:−75214, upst:CDKN2C:−127, upst:CDKN1C:−1017, andupst:CNNM4:−3840.

The phrase “differentially expressed” refers to differences in the quantity and/or the frequency of a biomarker present in a sample taken from patients having, for example, cancer or undergoing tissue regeneration or stem cell therapy as compared to a control subject. For example, a biomarker can be an lncRNA which is present at an elevated level or at a decreased level in samples of patients with cancer or undergoing tissue regeneration or stem cell therapy compared to samples of control subjects. Alternatively, a biomarker can be an lncRNA which is detected at a higher frequency or at a lower frequency in samples of patients with cancer or undergoing tissue regeneration or stem cell therapy compared to samples of control subjects or control tissues. A biomarker can be differentially present in terms of quantity, frequency or both.

An lncRNA is differentially expressed between two samples if the amount of the lncRNA in one sample is statistically significantly different from the amount of the lncRNA in the other sample. For example, an lncRNA is differentially expressed in two samples if it is present at least about 120%, at least about 130%, at least about 150%, at least about 180%, at least about 200%, at least about 300%, at least about 500%, at least about 700%, at least about 900%, or at least about 1000% greater than it is present in the other sample, or if it is detectable in one sample and not detectable in the other.
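The amount-based criterion can be illustrated with a minimal sketch. The function names, the `detection_limit` parameter, and the reading of "X% greater" as a percent increase over the lower of the two levels are assumptions made for illustration only. The test for statistical significance is deliberately omitted.

```python
def percent_greater(level_a, level_b):
    """Percent by which the higher level exceeds the lower level."""
    high, low = max(level_a, level_b), min(level_a, level_b)
    if low <= 0:
        raise ValueError("lower level must be positive")
    return (high - low) / low * 100.0


def differs_by_amount(level_a, level_b, min_percent=120.0, detection_limit=0.0):
    """Hypothetical check of the amount-based criterion.

    Assumption: a level above `detection_limit` counts as "detectable".
    """
    a_detected = level_a > detection_limit
    b_detected = level_b > detection_limit
    if a_detected != b_detected:
        return True   # detectable in one sample and not in the other
    if not a_detected:
        return False  # not detectable in either sample
    return percent_greater(level_a, level_b) >= min_percent
```

Under these assumptions, `min_percent` would be set to whichever threshold (120, 130, 150, and so on) is being applied.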

Alternatively or additionally, an lncRNA is differentially expressed in two sets of samples if the frequency of detecting the lncRNA in samples (e.g., tissue or cells from a patient suffering from cancer, undergoing stem cell therapy, or regenerative medical treatment) is statistically significantly higher or lower than in the control samples. For example, an lncRNA is differentially expressed in two sets of samples if it is observed at least about 120%, at least about 130%, at least about 150%, at least about 180%, at least about 200%, at least about 300%, at least about 500%, at least about 700%, at least about 900%, or at least about 1000% more frequently or less frequently in one set of samples than in the other set of samples.
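The frequency-based criterion can be sketched in the same spirit. The function names and the `detection_limit` parameter are hypothetical. Treating an lncRNA that is detected in only one of the two sets as differentially expressed is an assumption of this sketch, and statistical significance testing is again left out.

```python
def detection_frequency(levels, detection_limit=0.0):
    """Fraction of samples in which the lncRNA is detected."""
    if not levels:
        raise ValueError("at least one sample is required")
    detected = sum(1 for level in levels if level > detection_limit)
    return detected / len(levels)


def differs_by_frequency(test_levels, control_levels,
                         min_percent=120.0, detection_limit=0.0):
    """Hypothetical check of the frequency-based criterion."""
    f_test = detection_frequency(test_levels, detection_limit)
    f_ctrl = detection_frequency(control_levels, detection_limit)
    high, low = max(f_test, f_ctrl), min(f_test, f_ctrl)
    if high == 0:
        return False  # never detected in either set
    if low == 0:
        return True   # assumption: detected in one set only counts as different
    return (high - low) / low * 100.0 >= min_percent
```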

The terms “subject,” “individual,” and “patient,” are used interchangeably herein and refer to any mammalian subject for whom diagnosis, prognosis, treatment, or therapy is desired, particularly humans. Other subjects may include cattle, dogs, cats, guinea pigs, rabbits, rats, mice, horses, and so on. In some cases, the methods of the invention find use in experimental animals, in veterinary applications, and in the development of animal models for disease, including, but not limited to, rodents, including mice, rats, and hamsters; primates; and transgenic animals.

As used herein, a “biological sample” refers to a sample of tissue or fluid isolated from a subject, including but not limited to, for example, urine, blood, plasma, serum, fecal matter, bone marrow, bile, spinal fluid, lymph fluid, samples of the skin, external secretions of the skin, respiratory, intestinal, and genitourinary tracts, tears, saliva, milk, blood cells, organs, biopsies, and also samples containing cells or tissues derived from the subject and grown in culture, and in vitro cell culture constituents, including but not limited to, conditioned media resulting from the growth of cells and tissues in culture, recombinant cells, stem cells, and cell components.

The term “stem cell” refers to a cell that retains the ability to renew itself through mitotic cell division and that can differentiate into a diverse range of specialized cell types. Mammalian stem cells can be divided into three broad categories: embryonic stem cells, which are derived from blastocysts; adult stem cells, which are found in adult tissues; and cord blood stem cells, which are found in the umbilical cord. In a developing embryo, stem cells can differentiate into all of the specialized embryonic tissues. In adult organisms, stem cells and progenitor cells act as a repair system for the body by replenishing specialized cells. Totipotent stem cells are produced from the fusion of an egg and sperm cell. Cells produced by the first few divisions of the fertilized egg are also totipotent. These cells can differentiate into embryonic and extraembryonic cell types. Pluripotent stem cells are the descendants of totipotent cells and can differentiate into cells derived from any of the three germ layers. Multipotent stem cells can produce only cells of a closely related family of cells (e.g., hematopoietic stem cells differentiate into red blood cells, white blood cells, platelets, etc.). Unipotent cells can produce only one cell type, but have the property of self-renewal, which distinguishes them from non-stem cells.

The terms “quantity,” “amount,” and “level” are used interchangeably herein and may refer to an absolute quantification of a molecule or an analyte in a sample, or to a relative quantification of a molecule or analyte in a sample, i.e., relative to another value such as relative to a reference value as taught herein, or to a range of values for the biomarker. These values or ranges can be obtained from a single patient or from a group of patients.

A “test amount” of a biomarker refers to an amount of a biomarker present in a sample being tested. A test amount can be either an absolute amount (e.g., μg/ml) or a relative amount (e.g., relative intensity of signals).

A “diagnostic amount” of a biomarker refers to an amount of a biomarker in a subject's sample that is consistent with a diagnosis of cancer. A diagnostic amount can be either an absolute amount (e.g., μg/ml) or a relative amount (e.g., relative intensity of signals).

A “control amount” of a biomarker can be any amount or a range of amounts which is to be compared against a test amount of a biomarker. For example, a control amount of a biomarker can be the amount of a biomarker in a person without cancer, or normal tissue or cells, or untreated tissue or cells. A control amount can be either an absolute amount (e.g., μg/ml) or a relative amount (e.g., relative intensity of signals).

The term “antibody” encompasses polyclonal and monoclonal antibody preparations, as well as preparations including hybrid antibodies, altered antibodies, chimeric antibodies, and humanized antibodies, as well as: hybrid (chimeric) antibody molecules (see, for example, Winter et al. (1991) Nature 349:293-299; and U.S. Pat. No. 4,816,567); F(ab′)2 and F(ab) fragments; Fv molecules (noncovalent heterodimers, see, for example, Inbar et al. (1972) Proc Natl Acad Sci USA 69:2659-2662; and Ehrlich et al. (1980) Biochem 19:4091-4096); single-chain Fv molecules (sFv) (see, e.g., Huston et al. (1988) Proc Natl Acad Sci USA 85:5879-5883); dimeric and trimeric antibody fragment constructs; minibodies (see, e.g., Pack et al. (1992) Biochem 31:1579-1584; Cumber et al. (1992) J Immunology 149B:120-126); humanized antibody molecules (see, e.g., Riechmann et al. (1988) Nature 332:323-327; Verhoeyen et al. (1988) Science 239:1534-1536; and U.K. Patent Publication No. GB 2,276,169, published 21 Sep. 1994); and any functional fragments obtained from such molecules, wherein such fragments retain specific-binding properties of the parent antibody molecule.

“Immunoassay” is an assay that uses an antibody to specifically bind an antigen (e.g., a biomarker). The immunoassay is characterized by the use of specific binding properties of a particular antibody to isolate, target, and/or quantify the antigen. An immunoassay for a biomarker may utilize one antibody or several antibodies. Immunoassay protocols may be based, for example, upon competition, direct reaction, or sandwich type assays using, for example, labeled antibody. The labels may be, for example, fluorescent, chemiluminescent, or radioactive.

The phrase “specifically (or selectively) binds” to an antibody or “specifically (or selectively) immunoreactive with,” when referring to a biomarker, refers to a binding reaction that is determinative of the presence of the biomarker in a heterogeneous population of proteins, nucleic acids, and other biologics. Thus, under designated immunoassay conditions, the specified antibodies bind to a particular biomarker at least two times the background and do not substantially bind in a significant amount to other nucleic acids present in the sample. Specific binding to an antibody under such conditions may require an antibody that is selected for its specificity for a particular lncRNA. For example, polyclonal antibodies raised to a biomarker from specific species such as rat, mouse, or human can be selected to obtain only those polyclonal antibodies that are specifically immunoreactive with the biomarker and not with other nucleic acids, except for polymorphic variants and alleles of the biomarker. This selection may be achieved by subtracting out antibodies that cross-react with biomarker molecules from other species. A variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular biomarker. For example, solid-phase ELISA immunoassays are routinely used to select antibodies specifically immunoreactive with an antigen (see, e.g., Harlow & Lane, Antibodies, A Laboratory Manual (1988), for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity). Typically, a specific or selective reaction will be at least twice background signal or noise and more typically more than 10 to 100 times background.

“Capture reagent” refers to a molecule or group of molecules that specifically bind to a specific target molecule or group of target molecules. For example, a capture reagent can comprise two or more antibodies, each antibody having specificity for a separate target molecule. Capture reagents can be any combination of organic or inorganic chemicals, or biomolecules, and all fragments, analogs, homologs, conjugates, and derivatives thereof that can specifically bind a target molecule.

The capture reagent can comprise a single molecule that can form a complex with multiple targets, for example, a multimeric fusion protein with multiple binding sites for different targets. The capture reagent can comprise multiple molecules each having specificity for a different target, thereby resulting in multiple capture reagent-target complexes. In certain embodiments, the capture reagent is comprised of proteins, such as antibodies.

The capture reagent can be directly labeled with a detectable moiety. For example, an anti-biomarker antibody can be directly conjugated to a detectable moiety and used in the inventive methods, devices, and kits. In the alternative, detection of the capture reagent-biomarker complex can be by a secondary reagent that specifically binds to the biomarker or the capture reagent-biomarker complex. The secondary reagent can be any biomolecule, and is preferably an antibody. The secondary reagent is labeled with a detectable moiety. In some embodiments, the capture reagent or secondary reagent is coupled to biotin, and contacted with avidin or streptavidin having a detectable moiety tag.

“Detectable moieties” or “detectable labels” contemplated for use in the invention include, but are not limited to, radioisotopes, fluorescent dyes such as fluorescein, phycoerythrin, Cy-3, Cy-5, allophycocyanin, DAPI, Texas Red, rhodamine, Oregon green, Lucifer yellow, and the like, green fluorescent protein (GFP), red fluorescent protein (DsRed), Cyan Fluorescent Protein (CFP), Yellow Fluorescent Protein (YFP), Cerianthus Orange Fluorescent Protein (cOFP), alkaline phosphatase (AP), beta-lactamase, chloramphenicol acetyltransferase (CAT), adenosine deaminase (ADA), aminoglycoside phosphotransferase (neor, G418r), dihydrofolate reductase (DHFR), hygromycin-B-phosphotransferase (HPH), thymidine kinase (TK), lacZ (encoding beta-galactosidase), xanthine guanine phosphoribosyltransferase (XGPRT), Beta-Glucuronidase (gus), Placental Alkaline Phosphatase (PLAP), Secreted Embryonic Alkaline Phosphatase (SEAP), or Firefly or Bacterial Luciferase (LUC). Enzyme tags are used with their cognate substrate. The terms also include color-coded microspheres of known fluorescent light intensities (see e.g., microspheres with xMAP technology produced by Luminex (Austin, Tex.)); microspheres containing quantum dot nanocrystals, for example, containing different ratios and combinations of quantum dot colors (e.g., Qdot nanocrystals produced by Life Technologies (Carlsbad, Calif.)); glass coated metal nanoparticles (see e.g., SERS nanotags produced by Nanoplex Technologies, Inc. (Mountain View, Calif.)); barcode materials (see e.g., sub-micron sized striped metallic rods such as Nanobarcodes produced by Nanoplex Technologies, Inc.); encoded microparticles with colored bar codes (see e.g., CellCard produced by Vitra Bioscience, vitrabio.com); and glass microparticles with digital holographic code images (see e.g., CyVera microbeads produced by Illumina (San Diego, Calif.)). As with many of the standard procedures associated with the practice of the invention, skilled artisans will be aware of additional labels that can be used.

“Diagnosis” as used herein generally includes determination as to whether a subject is likely affected by a given disease, disorder or dysfunction. The skilled artisan often makes a diagnosis on the basis of one or more diagnostic indicators, i.e., a biomarker, the presence, absence, or amount of which is indicative of the presence or absence of the disease, disorder or dysfunction.

“Prognosis” as used herein generally refers to a prediction of the probable course and outcome of a clinical condition or disease. A prognosis of a patient is usually made by evaluating factors or symptoms of a disease that are indicative of a favorable or unfavorable course or outcome of the disease. It is understood that the term “prognosis” does not necessarily refer to the ability to predict the course or outcome of a condition with 100% accuracy. Instead, the skilled artisan will understand that the term “prognosis” refers to an increased probability that a certain course or outcome will occur; that is, that a course or outcome is more likely to occur in a patient exhibiting a given condition, when compared to those individuals not exhibiting the condition.

II. MODES OF CARRYING OUT THE INVENTION

Before describing the present invention in detail, it is to be understood that this invention is not limited to particular formulations or process parameters as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments of the invention only, and is not intended to be limiting.

Although a number of methods and materials similar or equivalent to those described herein can be used in the practice of the present invention, the preferred materials and methods are described herein.

The present invention is based on the discovery of lncRNAs that play roles in regulation of genes involved in cell proliferation, differentiation, and apoptosis. Such lncRNAs can be used as biomarkers to monitor cell proliferation and differentiation during cancer progression or tissue regeneration. In particular, the inventors have shown that an lncRNA, referred to as PANDA (a P21-Associated NcRNA, DNA damage Activated), inhibits the expression of apoptotic genes normally activated by the transcription factor NF-YA. The inventors have further shown that inhibitors of PANDA sensitize cancerous cells to chemotherapy and can be used in combination with chemotherapeutic agents for treating cancer (see Example 1). In order to further an understanding of the invention, a more detailed discussion is provided below regarding the identified lncRNAs and their diagnostic and therapeutic uses for cancer, stem cell therapy, and regenerative medicine.

A. Biomarkers

Biomarkers that can be used in the practice of the invention includelncRNAs such as, but not limited to int:CDK6:143, dst:CDKN2A:43877,upst:CCNF:−1721, upst:CCNI:−6398, upst:CCNI:−6621, upst:CCNI:−6883,upst:CDKN1A:−4845, upst:CDK5R1:−4044, upst:CDK5R1:−4410,upst:CCNL2:−1391, upst:CCNL2:−2253, upst:CCNL2:−767, int:CDKN2D:1417,upst:CCNL2:−5540, int:CDKN1A:1420, int:CCNT1:602, upst:CCNL2:−3110,upst:CDK5R1:−5717, upst:CCNL2:−982, upst:CCNE2:−682, int:CDK5R1:183,upst:CDK5R1:−482, upst:CDK8:−798, upst:CDK9:−646, upst:CDK6:−1860,int:CDK6:1276, upst:CDK6:−533, upst:CDKN2C:−8037, upst:CCNK:−899,upst:CNNM3:−248, upst:CDKN1C:−4619, int:CDKN2A:6667, int:ARF:4530,upst:CDKN2B:−15913, upst:CDK6:−1679, upst:CDKN1A:−1210, int:CDKN2B:1926,dst:CDKN2A:39498, upst:CCNL1:−1968, upst:CCNL1:−2234, upst:CCNL1:−2383,upst:CCNL1:−2767, upst:CDK5R2:−6418, upst:CDK4:−7794, upst:CDKN1A:−5830,int:CDKN2C:159, upst:CCNYL2:−36, upst:CCNC:−6760, upst:CDKN2B:−2817,upst:CNNM3:−970, upst:CDK5R2:−6045, upst:CDKN1C:−2196, int:CCND1:874,int:CCND2:1205, upst:CDKN1C:−446, int:CCNG2:390, upst:CDK3:−4148,upst:CCNA2:−250, int:CDKL5:64, upst:CCND2: 3165, int:CCNK:210,int:CDKN1A:885, upst:CDK5R2:−9197, int:CNNM3:1459, upst:CCND1:−1659,int:CCNL2:463, upst:CCNE1:−1190, upst:CDK5R2:−8037, upst:CDKL3:−867,int:CCNG1:381, upst:CCND2:−2874, upst:CDKN2B:−130736, int:CCNI:1042,upst:CCND2:−4757, int:CDK9:352, int:CCND2:1689, int:CDKL5:1682,upst:CDK5R2:−4541, upst:CDK5:−7855, upst:CDK9:−1536, upst:CCND2:−1291,upst:CCND1:−377, int:CCNL1:1097, upst:CDK5R2:−648, upst:CCNL2:−7336,upst:CCND1:−2768, upst:CDK2:−1390, upst:CCNYL3:−8181, dst:CDKN2A:8650,upst:CDK8:−265, upst:CDK4:−4462, upst:CDKN2A:−44, int:CDKN2A:5270,upst:CCNJL:−2749, upst:CNNM4:−1843, upst:CDK5R2:−7376, int:CCNO:1417,upst:CDKN1C:−5, upst:CDKN1C:−6280, upst:ARF:−840, upst:CCND2:−1830,upst:CDK5R1:−206, upst:CCNA1:−1163, int:CCNE2:647, upst:CDK9:−909,upst:CCNYL3:−293, upst:CDKN3:−271, int:CCNT2:640, upst:CCND1:−2574,upst:CCNT2:−319, upst:CDK5R1:−3023, 
upst:CDK9:−3159, upst:CDK9:−8667,upst:CCNE2:−4956, int:CCND3:2384, upst:CDKN1B:−1362, upst:CCNI:−7899,upst:CCNT2:−6751, int:CDK5:1993, upst:CDK9:−8509, upst:CCND1:−7190,upst:CDKN1C:−7144, upst:CDKN3:−4479, upst:CCNB3:−3258, upst:CCND3:−9303,upst:CDK8:−8337, int:CDKN2C:643, upst:CCNYL3:−1019, upst:CDK5:−2373,int:CNNM4:1658, upst:CCNE2:−8552, upst:CCNG1:−9141, upst:CCND2:−4886,upst:CCNK:−8357, upst:CDK5:−9105, upst:CDKN2B:−108997, int:CCNB2:547,upst:CDKN3:−2291, dst:CDKN2A:30203, upst:CDK2:−5210, upst:CCNL1:−3430,upst:CCNF:−3964, upst:CCNK:−4426, upst:CCNF:−3743, upst:CDK5:−3754,upst:CDKN2B:−35359, upst:CDKN2B:−87467, upst:CDK5R2:−4915,upst:CCNF:−2075, upst:CDK6:−8726, upst:CDKN2B:−90566, int:CDKN2A:4904,int:CDKN2A:4432, upst:ARF:−2148, upst:CDKN2B:−130339, upst:CNNM3:−9238,upst:CCNG1:−4532, int:ARF:15754, upst:CCNF:−1085, upst:CDKN2B:−23831,upst:CDKN1A:−9569, int:CCNI:1874, dst:CDKN2A:45866, int:CCNC:816,upst:CCNC:−5405, upst:CDK4:−1632, upst:CCNK:−3241, upst:CDK10:−1805,upst:CCNJL:−671, upst:CDKN3:−5723, upst:CDKN2B:−15114, upst:CCNE2:−5939,upst:CCNJL:−7299, upst:CCND3:−4248, int:CDK9:1811, upst:CDKN2C:−8538,upst:ARF:−1395, upst:CCND2:−6904, upst:CDK4:−977, upst:CCNE1:−5422,upst:CCNE2:−2828, upst:CDK4:−2133, upst:CDK8:−9630, upst:CDK3:−4497,upst:CCND3:−6423, upst:CCND1:−8918, upst:CDKN2B:−119804,upst:CDKN3:−5438, upst:CDKN2C:−7397, upst:CCNYL1:−3709,upst:CDKL4:−6205, upst:CDKN1A:−2237, upst:CDKN1C:−4093,upst:CCND2:−9042, int:CDK8:566, upst:CDKN2B:−804, upst:CCNE1:−4445,upst:CDKN2B:−74328, upst:CDKN2B:−53107, upst:CCNE1:−9426,upst:CDKN2C:−3161, upst:CCNG2:−2953, upst:CNNM1:−2645,upst:CDKN1A:−1902, upst:CDKN3:−1974, upst:CDK10:−4173, upst:CDK9:−9782,upst:CDKN1C:−5693, upst:CDK5:−9871, upst:CNNM4:−4755,upst:CDKN2B:−31120, upst:CDK2:−8040, upst:CDKN2B:−75214,upst:CDKN2C:−127, upst:CDKN1C:−1017, and upst:CNNM4:−3840;polynucleotide fragments thereof, and variants comprising nucleotidesequences displaying at least about 80-100% sequence identity thereto,including any 
percent identity within this range, such as 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99% sequence identity thereto. Differential expression of these biomarkers is associated with cell proliferation, differentiation, or apoptosis, and therefore expression profiles of these biomarkers are useful for diagnosing cancer and monitoring differentiation and regeneration of tissues and cells.
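As an illustration of the sequence identity threshold, the following minimal sketch computes percent identity between two sequences that have already been aligned to equal length. It uses one common convention (identical positions divided by compared alignment columns); the function names, the gap character, and this convention are assumptions for illustration, and the alignment step itself is not shown.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # a column with gaps in both sequences is ignored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def is_variant(aligned_a, aligned_b, min_identity=80.0):
    """Hypothetical check that a sequence meets the identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= min_identity
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values, so the convention should be stated alongside any reported percentage.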

Accordingly, in one aspect, the invention provides a method for diagnosing cancer in a subject, comprising measuring the level of a plurality of biomarkers in a biological sample derived from a subject suspected of having cancer, and analyzing the levels of the biomarkers and comparing with respective reference value ranges for the biomarkers, wherein differential expression of one or more biomarkers in the biological sample compared to one or more biomarkers in a control sample indicates that the subject has cancer. The biomarkers can be used alone or in combination with relevant clinical parameters in prognosis, diagnosis, or monitoring treatment of cancer. In one embodiment, the plurality of biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNL1:−2767, int:CDKN1A:+885, upst:CDKN1A:−4845, upst:CDKN2B:−2,817, upst:CDK9:−9782, int:ARF:+4,517, int:ARF:+4530, upst:CDKN1C:−1017, int:CCNG1:+381, and upst:CCNG2:−2953. In another embodiment, PANDA is used alone or in combination with one or more additional biomarkers or clinical parameters in diagnosing cancer. In certain embodiments, the cancer comprises a mutation in the TP53 gene.

When analyzing the levels of biomarkers in a biological sample, the reference value ranges used for comparison can represent the level of one or more biomarkers found in one or more samples of one or more subjects without cancer (i.e., normal or control samples). Alternatively, the reference values can represent the level of one or more biomarkers found in one or more samples of one or more subjects with cancer. More specifically, the reference value ranges can represent the level of one or more biomarkers at particular stages of disease (e.g., mild, moderate, or severe dysplasia, cancer in situ, or invasive cancer) to facilitate a determination of the stage of disease progression in an individual.
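A comparison of measured levels with reference value ranges might be organized as in the following minimal sketch. The function names, the dictionary layout, and the labels returned are hypothetical; how the reference ranges are derived and how the calls are interpreted clinically are outside the scope of the sketch.

```python
def compare_to_reference(levels, reference_ranges):
    """Classify each measured biomarker level against its reference range.

    levels: dict mapping biomarker name -> measured level
    reference_ranges: dict mapping biomarker name -> (low, high)
    """
    calls = {}
    for name, level in levels.items():
        if name not in reference_ranges:
            calls[name] = "no reference"
            continue
        low, high = reference_ranges[name]
        if level < low:
            calls[name] = "below"
        elif level > high:
            calls[name] = "above"
        else:
            calls[name] = "within"
    return calls


def outside_reference(levels, reference_ranges):
    """Names of biomarkers whose level falls outside the reference range."""
    calls = compare_to_reference(levels, reference_ranges)
    return sorted(name for name, call in calls.items()
                  if call in ("below", "above"))
```

The same functions could be applied with ranges from subjects without cancer or with stage-specific ranges, depending on which reference population is chosen.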

In another embodiment, the invention includes a method for monitoring the efficacy of a therapy for treating cancer in a subject, the method comprising: analyzing the level of each of one or more biomarkers in samples derived from the subject before and after the subject undergoes said therapy, in conjunction with respective reference value ranges for said one or more biomarkers, wherein the one or more biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNL1:−2767, int:CDKN1A:+885, upst:CDKN1A:−4845, upst:CDKN2B:−2,817, upst:CDK9:−9782, int:ARF:+4,517, int:ARF:+4530, upst:CDKN1C:−1017, int:CCNG1:+381, and upst:CCNG2:−2953.
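One way to analyze before and after levels in conjunction with a reference range is sketched below. The function name and the idea of scoring a change by its distance from the reference range are assumptions for illustration. Whether movement toward a given range indicates a therapeutic response depends on which population the range was derived from, and is not decided by the sketch.

```python
def shift_relative_to_reference(before, after, reference_range):
    """Report whether a level moved toward or away from a reference range.

    Returns "toward", "away", or "unchanged" based on the distance of each
    level from the nearest bound of the range (zero when inside the range).
    """
    low, high = reference_range

    def distance(level):
        if level < low:
            return low - level
        if level > high:
            return level - high
        return 0.0

    d_before, d_after = distance(before), distance(after)
    if d_after < d_before:
        return "toward"
    if d_after > d_before:
        return "away"
    return "unchanged"
```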

In another embodiment, the invention includes a method for evaluating the effect of an agent for treating cancer in a subject, the method comprising: analyzing the level of each of one or more biomarkers in samples derived from the subject before and after the subject is treated with said agent, in conjunction with respective reference value ranges for said one or more biomarkers, wherein the one or more biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNL1:−2767, int:CDKN1A:+885, upst:CDKN1A:−4845, upst:CDKN2B:−2,817, upst:CDK9:−9782, int:ARF:+4,517, int:ARF:+4530, upst:CDKN1C:−1017, int:CCNG1:+381, and upst:CCNG2:−2953.

In another aspect, the invention includes a method for monitoring tissue regeneration in a subject, the method comprising measuring the level of a plurality of biomarkers in a biological sample derived from the subject, wherein the plurality of biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017; and analyzing the levels of the biomarkers in conjunction with respective reference value ranges for said plurality of biomarkers.

In another embodiment, the invention includes a method for monitoring cell differentiation in a tissue grown in culture, the method comprising measuring the level of a plurality of biomarkers in a cell derived from the tissue, wherein the plurality of biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017; and analyzing the levels of the biomarkers in conjunction with respective reference value ranges for said plurality of biomarkers. In certain embodiments, the tissue is derived from a stem cell. The stem cell can be an embryonic stem cell, an adult stem cell, or a cord blood stem cell, and can be totipotent, pluripotent, multipotent, or unipotent.

In another embodiment, the invention includes a method for evaluating the effect of an agent for regenerating tissue in a subject, the method comprising: analyzing the level of each of one or more biomarkers in samples derived from the subject before and after the subject is treated with said agent, in conjunction with respective reference value ranges for said one or more biomarkers, wherein the one or more biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017.

In another embodiment, the invention includes a method for monitoring the efficacy of a therapy for regenerating tissue in a subject, the method comprising: analyzing the level of each of one or more biomarkers in samples derived from the subject before and after the subject undergoes said therapy, in conjunction with respective reference value ranges for said one or more biomarkers, wherein the one or more biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017.

In another embodiment, the invention includes a method for evaluating the effect of an agent for inducing differentiation of a stem cell in a subject, the method comprising: analyzing the level of each of one or more biomarkers in samples derived from the subject before and after the subject is treated with said agent, in conjunction with respective reference value ranges for said one or more biomarkers, wherein the one or more biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017.

In another embodiment, the invention includes a method for monitoring the efficacy of stem cell therapy in a subject, the method comprising: analyzing the level of each of one or more biomarkers in samples derived from the subject before and after the subject undergoes said stem cell therapy, in conjunction with respective reference value ranges for said one or more biomarkers, wherein the one or more biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017.

In another embodiment, the invention includes a method for evaluating the effect of an agent for inducing differentiation of a stem cell, the method comprising growing the stem cell in culture; treating the culture with the agent; measuring the level of a plurality of biomarkers in a cultured cell derived from the stem cell after treating the culture with the agent, wherein the plurality of biomarkers comprises one or more lncRNAs selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017; and analyzing the levels of the biomarkers in conjunction with respective reference value ranges for said plurality of biomarkers.

In cases in which biomarkers are used to monitor stem cell therapy or cell or tissue differentiation, the reference value ranges used for comparison can represent the level of one or more biomarkers found in one or more samples of one or more healthy or untreated subjects or normal or untreated tissues or cells (i.e., normal or control samples). Alternatively, the reference values can represent the level of one or more biomarkers found in one or more samples of one or more subjects in need of stem cell therapy or regenerative medical treatment. More specifically, the reference value ranges can represent the level of one or more biomarkers in tissues or cells at particular stages of differentiation or treatment to aid in determining an appropriate treatment regimen.

In cases in which the subject is being diagnosed for cancer, the biological sample obtained from the subject to be diagnosed is typically a biopsy of abnormal tissue suspected of containing cancerous or dysplastic cells, but can be any sample of tissue or cells that contains the expressed biomarkers. In cases in which the subject is undergoing stem cell therapy or regenerative medical treatment, the biological sample may include samples from in vitro cell culture resulting from the growth in culture of cells, tissues, or organs which are to be transferred to the subject, or a biopsy of tissue from the subject. The biological sample can be obtained from the subject by conventional techniques. For example, samples of tissue or cells can be obtained by surgical techniques well known in the art.

In certain embodiments, the biological sample may comprise a tissue sample including a portion, piece, part, segment, or fraction of a tissue which is obtained or removed from an intact tissue of a subject. Tissue samples can be obtained, for example, from the breast, pancreas, stomach, liver, secretory gland, bladder, lung, prostate gland, ovary, cervix, uterus, brain, eye, connective tissue, bone, muscles, vasculature, skin, oral cavity, tongue, head, neck, or throat. A tissue biopsy may be obtained by methods including, but not limited to, an aspiration biopsy, a brush biopsy, a surface biopsy, a needle biopsy, a punch biopsy, an excision biopsy, an open biopsy, an incision biopsy or an endoscopic biopsy.

In certain embodiments, the biological sample is a tumor sample, including the entire tumor or a portion, piece, part, segment, or fraction of a tumor. A tumor sample can be obtained from a solid tumor or from a non-solid tumor, for example, from a squamous cell carcinoma, skin carcinoma, oral cavity carcinoma, head carcinoma, throat carcinoma, neck carcinoma, breast carcinoma, lung carcinoma, basal cell carcinoma, a colon carcinoma, a cervical carcinoma, Kaposi sarcoma, prostate carcinoma, an adenocarcinoma, a melanoma, hemangioma, meningioma, astrocytoma, neuroblastoma, carcinoma of the pancreas, gastric carcinoma, colorectal carcinoma, colon carcinoma, transitional cell carcinoma of the bladder, carcinoma of the larynx, chronic myeloid leukemia, acute lymphocytic leukemia, acute promyelocytic leukemia, multiple myeloma, T-cell lymphoma, B-cell lymphomas, retinoblastoma, sarcoma, gallbladder, or bronchial cancer. The tumor sample may be obtained from a primary tumor or from a metastatic lesion.

In other embodiments, the biological sample is a stem cell, a population of stem cells, or a differentiated cell, tissue, or organ derived from stem cells. Stem cells may be embryonic stem cells, adult stem cells, or cord blood stem cells, and may be totipotent, pluripotent, multipotent, or unipotent.

A “control” sample as used herein refers to a biological sample, such as tissue or cells, that is not diseased. That is, a control sample is obtained from a normal subject (e.g., an individual known to not have cancer or dysplasia or any condition or symptom associated with abnormal cell maturation or proliferation).

In certain embodiments, a panel of biomarkers is used for diagnosing cancer or monitoring cancer progression, stem cell therapy, or regenerative medical treatments. Biomarker panels of any size can be used in the practice of the invention. Biomarker panels typically comprise at least 4 biomarkers and up to 30 biomarkers, including any number of biomarkers in between, such as 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 biomarkers. In certain embodiments, the invention includes a biomarker panel comprising at least 4, or at least 5, or at least 6, or at least 7, or at least 8, or at least 9, or at least 10 or more biomarkers. Although smaller biomarker panels are usually more economical, larger biomarker panels (i.e., greater than 30 biomarkers) have the advantage of providing more detailed information and can also be used in the practice of the invention.

In certain embodiments, the invention includes a biomarker panelcomprising a plurality of lncRNAs selected from the group consisting ofint:CDK6:143, dst:CDKN2A:43877, upst:CCNF:−1721, upst:CCNI:−6398,upst:CCNI:−6621, upst:CCNI:−6883, upst:CDKN1A:−4845, upst:CDK5R1:−4044,upst:CDK5R1:−4410, upst:CCNL2:−1391, upst:CCNL2:−2253, upst:CCNL2:−767,int:CDKN2D:1417, upst:CCNL2:−5540, int:CDKN1A:1420, int:CCNT1:602,upst:CCNL2:−3110, upst:CDK5R1:−5717, upst:CCNL2:−982, upst:CCNE2:−682,int:CDK5R1:183, upst:CDK5R1:−482, upst:CDK8:−798, upst:CDK9:−646,upst:CDK6:−1860, int:CDK6:1276, upst:CDK6:−533, upst:CDKN2C:−8037,upst:CCNK:−899, upst:CNNM3:−248, upst:CDKN1C:−4619, int:CDKN2A:6667,int:ARF:4530, upst:CDKN2B:−15913, upst:CDK6:−1679, upst:CDKN1A:−1210,int:CDKN2B:1926, dst:CDKN2A:39498, upst:CCNL1:−1968, upst:CCNL1:−2234,upst:CCNL1:−2383, upst:CCNL1:−2767, upst:CDK5R2:−6418, upst:CDK4:−7794,upst:CDKN1A:−5830, int:CDKN2C:159, upst:CCNYL2:−36, upst:CCNC:−6760,upst:CDKN2B:−2817, upst:CNNM3:−970, upst:CDK5R2:−6045,upst:CDKN1C:−2196, int:CCND1:874, int:CCND2:1205, upst:CDKN1C:−446,int:CCNG2:390, upst:CDK3:−4148, upst:CCNA2:−250, int:CDKL5:64,upst:CCND2: 3165, int:CCNK:210, int:CDKN1A:885, upst:CDK5R2:−9197,int:CNNM3:1459, upst:CCND1:−1659, int:CCNL2:463, upst:CCNE1:−1190,upst:CDK5R2:−8037, upst:CDKL3:−867, int:CCNG1:381, upst:CCND2:−2874,upst:CDKN2B:−130736, int:CCNI:1042, upst:CCND2:−4757, int:CDK9:352,int:CCND2:1689, int:CDKL5:1682, upst:CDK5R2:−4541, upst:CDK5:−7855,upst:CDK9:−1536, upst:CCND2:−1291, upst:CCND1:−377, int:CCNL1:1097,upst:CDK5R2:−648, upst:CCNL2:−7336, upst:CCND1:−2768, upst:CDK2:−1390,upst:CCNYL3:−8181, dst:CDKN2A:8650, upst:CDK8:−265, upst:CDK4:−4462,upst:CDKN2A:−44, int:CDKN2A:5270, upst:CCNJL:−2749, upst:CNNM4:−1843,upst:CDK5R2:−7376, int:CCNO:1417, upst:CDKN1C:−5, upst:CDKN1C:−6280,upst:ARF:−840, upst:CCND2:−1830, upst:CDK5R1:−206, upst:CCNA1:−1163,int:CCNE2:647, upst:CDK9:−909, upst:CCNYL3:−293, upst:CDKN3:−271,int:CCNT2:640, upst:CCND1:−2574, upst:CCNT2:−319, 
upst:CDK5R1:−3023,upst:CDK9:−3159, upst:CDK9:−8667, upst:CCNE2:−4956, int:CCND3:2384,upst:CDKN1B:−1362, upst:CCNI:−7899, upst:CCNT2:−6751, int:CDK5:1993,upst:CDK9:−8509, upst:CCND1:−7190, upst:CDKN1C:−7144, upst:CDKN3:−4479,upst:CCNB3:−3258, upst:CCND3:−9303, upst:CDK8:−8337, int:CDKN2C:643,upst:CCNYL3:−1019, upst:CDK5:−2373, int:CNNM4:1658, upst:CCNE2:−8552,upst:CCNG1:−9141, upst:CCND2:−4886, upst:CCNK:−8357, upst:CDK5:−9105,upst:CDKN2B:−108997, int:CCNB2:547, upst:CDKN3:−2291, dst:CDKN2A:30203,upst:CDK2:−5210, upst:CCNL1:−3430, upst:CCNF:−3964, upst:CCNK:−4426,upst:CCNF:−3743, upst:CDK5:−3754, upst:CDKN2B:−35359,upst:CDKN2B:−87467, upst:CDK5R2:−4915, upst:CCNF:−2075, upst:CDK6:−8726,upst:CDKN2B:−90566, int:CDKN2A:4904, int:CDKN2A:4432, upst:ARF:−2148,upst:CDKN2B:−130339, upst:CNNM3:−9238, upst:CCNG1:−4532, int:ARF:15754,upst:CCNF:−1085, upst:CDKN2B:−23831, upst:CDKN1A:−9569, int:CCNI:1874,dst:CDKN2A:45866, int:CCNC:816, upst:CCNC:−5405, upst:CDK4:−1632,upst:CCNK:−3241, upst:CDK10:−1805, upst:CCNJL:−671, upst:CDKN3:−5723,upst:CDKN2B:−15114, upst:CCNE2:−5939, upst:CCNJL:−7299,upst:CCND3:−4248, int:CDK9:1811, upst:CDKN2C:−8538, upst:ARF:−1395,upst:CCND2:−6904, upst:CDK4:−977, upst:CCNE1:−5422, upst:CCNE2:−2828,upst:CDK4:−2133, upst:CDK8:−9630, upst:CDK3:−4497, upst:CCND3:−6423,upst:CCND1:−8918, upst:CDKN2B:−119804, upst:CDKN3:−5438,upst:CDKN2C:−7397, upst:CCNYL1:−3709, upst:CDKL4:−6205,upst:CDKN1A:−2237, upst:CDKN1C:−4093, upst:CCND2:−9042, int:CDK8:566,upst:CDKN2B:−804, upst:CCNE1:−4445, upst:CDKN2B:−74328,upst:CDKN2B:−53107, upst:CCNE1:−9426, upst:CDKN2C:−3161,upst:CCNG2:−2953, upst:CNNM1:−2645, upst:CDKN1A:−1902, upst:CDKN3:−1974,upst:CDK10:−4173, upst:CDK9:−9782, upst:CDKN1C:−5693, upst:CDK5:−9871,upst:CNNM4:−4755, upst:CDKN2B:−31120, upst:CDK2:−8040,upst:CDKN2B:−75214, upst:CDKN2C:−127, upst:CDKN1C:−1017, andupst:CNNM4:−3840.

In one embodiment, the invention includes a biomarker panel comprising a plurality of lncRNAs selected from the group consisting of int:CDK6:1276, upst:CDK6:−533, upst:CDKN2C:−8037, int:CDKN2D:1417, upst:CCNL2:−5540, int:CDKN1A:1420, int:CCNT1:602, upst:CCNL2:−3110, upst:CDK5R1:−5717, upst:CCNL2:−982, upst:CCNE2:−682, int:CDK5R1:183, upst:CDK5R1:−482, upst:CDK8:−798, upst:CDK9:−646, upst:CDK6:−1860, int:CDKN2B:1926, int:CDK6:143, dst:CDKN2A:43877, upst:CCNF:−1721, upst:CCNI:−6398, upst:CCNI:−6621, upst:CCNI:−6883, upst:CDKN1A:−4845, upst:CDK5R1:−4044, upst:CDK5R1:−4410, upst:CCNL2:−1391, upst:CCNL2:−2253, upst:CCNL2:−767, dst:CDKN2A:39498, upst:CCNL1:−1968, upst:CCNL1:−2234, upst:CCNL1:−2383, upst:CCNL1:−2767, upst:CDK5R2:−6418, upst:CDK4:−7794, upst:CDKN1A:−5830, int:CDKN2C:159, upst:CCNYL2:−36, upst:CCNC:−6760, upst:CDKN2B:−2817, upst:CNNM3:−970, upst:CDK5R2:−6045, and upst:CDKN1C:−2196.

In another embodiment, the biomarker panel comprises upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017. In a further embodiment, the biomarker panel comprises upst:CCNL1:−2767, upst:CDKN1A:−4845, upst:CDK9:−9782, int:ARF:+4530, upst:CDKN1C:−1017, upst:CCNG2:−2953, and int:CCNG1:+381.

The methods of the invention, as described herein, can also be used for determining the prognosis of a subject and for monitoring treatment of a subject having cancer. The inventors have shown that some lncRNAs, including upst:CCNL1:−2,767 and int:CDKN1A:+885, are repressed in metastatic breast cancers relative to normal mammary tissues, whereas others, including upst:CDKN1A:−4,845, upst:CDKN2B:−2,817, and int:ARF:+4,517, are induced (see Example 1). Thus, a medical practitioner can monitor the progress of disease by measuring the levels of these lncRNAs in a biological sample from the patient. For example, an increase in an upst:CCNL1:−2,767 or int:CDKN1A:+885 level as compared to a prior level (e.g., in a prior biological sample from the same area of lesion) indicates the disease or condition in the subject is improving or has improved, while a decrease of the upst:CCNL1:−2,767 or int:CDKN1A:+885 level as compared to a prior level (e.g., in a prior biological sample from the same area of lesion) indicates the disease or condition in the subject has worsened or is worsening. In another example, a decrease in an upst:CDKN1A:−4,845, upst:CDKN2B:−2,817, or int:ARF:+4,517 level as compared to a prior level (e.g., in a prior biological sample from the same area of lesion) indicates the disease or condition in the subject is improving or has improved, while an increase of the upst:CDKN1A:−4,845, upst:CDKN2B:−2,817, or int:ARF:+4,517 level as compared to a prior level (e.g., in a prior biological sample from the same area of lesion) indicates the disease or condition in the subject has worsened or is worsening.
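The monitoring logic in the preceding paragraph can be illustrated with a short Python sketch. This is an illustration only and not part of the disclosed methods: the function name `interpret_change`, the table `DIRECTION_IN_CANCER`, and the plain greater-than/less-than comparison (with no allowance for measurement noise) are assumptions made for the example. Identifiers are written with an ASCII hyphen and without thousands separators.

```python
# Hypothetical sketch: interpret a change in one lncRNA level between a prior
# sample and a current sample from the same area of lesion.

# Direction of change in metastatic breast cancer relative to normal tissue,
# as stated in the paragraph above.
DIRECTION_IN_CANCER = {
    "upst:CCNL1:-2767": "repressed",
    "int:CDKN1A:+885": "repressed",
    "upst:CDKN1A:-4845": "induced",
    "upst:CDKN2B:-2817": "induced",
    "int:ARF:+4517": "induced",
}


def interpret_change(lncrna, prior_level, current_level):
    """Return 'improving', 'worsening', or 'unchanged' for one lncRNA."""
    direction = DIRECTION_IN_CANCER[lncrna]
    if current_level == prior_level:
        return "unchanged"
    rising = current_level > prior_level
    if direction == "repressed":
        # A marker repressed in cancer moving back up suggests improvement.
        return "improving" if rising else "worsening"
    # A marker induced in cancer moving up suggests worsening.
    return "worsening" if rising else "improving"
```

In practice, a threshold for what counts as a meaningful change would have to be chosen for the assay in use; the sketch deliberately leaves that out.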

The methods described herein for prognosis or diagnosis of cancer may be used in individuals who have not yet been diagnosed (for example, preventative screening), or who have been diagnosed, or who are suspected of having cancer (e.g., display one or more characteristic symptoms), or who are at risk of developing cancer (e.g., have a genetic predisposition or presence of one or more developmental, environmental, or behavioral risk factors). The methods may also be used to detect various stages of progression or severity of disease. The methods may also be used to detect the response of disease to prophylactic or therapeutic treatments or other interventions. The methods can furthermore be used to help the medical practitioner in determining prognosis (e.g., worsening, status-quo, partial recovery, or complete recovery) of the patient, and the appropriate course of action, resulting in either further treatment or observation, or in discharge of the patient from the medical care center.

B. Detecting and Measuring Levels of Biomarkers

It is understood that the expression level of the biomarkers in a sample can be determined by any suitable method known in the art. Measurement of the level of a biomarker can be direct or indirect. For example, the abundance levels of lncRNAs can be directly quantitated. Alternatively, the amount of a biomarker can be determined indirectly by measuring abundance levels of cDNAs, amplified RNAs or DNAs, or by measuring quantities or activities of RNAs, or other molecules that are indicative of the expression level of the biomarker.

LncRNAs can be detected and quantitated by a variety of methods including, but not limited to, microarray analysis, polymerase chain reaction (PCR), reverse transcriptase polymerase chain reaction (RT-PCR), Northern blot, serial analysis of gene expression (SAGE), immunoassay, and mass spectrometry. See, e.g., Draghici, Data Analysis Tools for DNA Microarrays, Chapman and Hall/CRC, 2003; Simon et al., Design and Analysis of DNA Microarray Investigations, Springer, 2004; Real-Time PCR: Current Technology and Applications, Logan, Edwards, and Saunders eds., Caister Academic Press, 2009; Bustin, A-Z of Quantitative PCR (IUL Biotechnology, No. 5), International University Line, 2004; Velculescu et al. (1995) Science 270: 484-487; Matsumura et al. (2005) Cell. Microbiol. 7: 11-18; Serial Analysis of Gene Expression (SAGE): Methods and Protocols (Methods in Molecular Biology), Humana Press, 2008; Hoffmann and Stroobant, Mass Spectrometry: Principles and Applications, Third Edition, Wiley, 2007; herein incorporated by reference in their entireties.

In one embodiment, microarrays are used to measure the levels of biomarkers. An advantage of microarray analysis is that the expression of each of the biomarkers can be measured simultaneously, and microarrays can be specifically designed to provide a diagnostic expression profile for a particular disease or condition (e.g., cancer, regenerative medicine).

Microarrays are prepared by selecting probes which comprise a polynucleotide sequence, and then immobilizing such probes to a solid support or surface. For example, the probes may comprise DNA sequences, RNA sequences, or copolymer sequences of DNA and RNA. The polynucleotide sequences of the probes may also comprise DNA and/or RNA analogues, or combinations thereof. For example, the polynucleotide sequences of the probes may be full or partial fragments of genomic DNA. The polynucleotide sequences of the probes may also be synthesized nucleotide sequences, such as synthetic oligonucleotide sequences. The probe sequences can be synthesized either enzymatically in vivo, enzymatically in vitro (e.g., by PCR), or non-enzymatically in vitro.

Probes used in the methods of the invention are preferably immobilized to a solid support which may be either porous or non-porous. For example, the probes may be polynucleotide sequences which are attached to a nitrocellulose or nylon membrane or filter covalently at either the 3′ or the 5′ end of the polynucleotide. Such hybridization probes are well known in the art (see, e.g., Sambrook, et al., Molecular Cloning: A Laboratory Manual (3rd Edition, 2001)). Alternatively, the solid support or surface may be a glass or plastic surface. In one embodiment, hybridization levels are measured to microarrays of probes consisting of a solid phase on the surface of which is immobilized a population of polynucleotides, such as a population of DNA or DNA mimics, or, alternatively, a population of RNA or RNA mimics. The solid phase may be a nonporous or, optionally, a porous material such as a gel.

In one embodiment, the microarray comprises a support or surface with an ordered array of binding (e.g., hybridization) sites or “probes” each representing one of the biomarkers described herein. Preferably the microarrays are addressable arrays, and more preferably positionally addressable arrays. More specifically, each probe of the array is preferably located at a known, predetermined position on the solid support such that the identity (i.e., the sequence) of each probe can be determined from its position in the array (i.e., on the support or surface). Each probe is preferably covalently attached to the solid support at a single site.

Microarrays can be made in a number of ways, of which several are described below. However they are produced, microarrays share certain characteristics. The arrays are reproducible, allowing multiple copies of a given array to be produced and easily compared with each other. Preferably, microarrays are made from materials that are stable under binding (e.g., nucleic acid hybridization) conditions. Microarrays are generally small, e.g., between 1 cm² and 25 cm²; however, larger arrays may also be used, e.g., in screening arrays. Preferably, a given binding site or unique set of binding sites in the microarray will specifically bind (e.g., hybridize) to the product of a single gene in a cell (e.g., to a specific mRNA, lncRNA, or to a specific cDNA derived therefrom). However, in general, other related or similar sequences will cross hybridize to a given binding site.

As noted above, the “probe” to which a particular polynucleotide molecule specifically hybridizes contains a complementary polynucleotide sequence. The probes of the microarray typically consist of nucleotide sequences of no more than 1,000 nucleotides. In some embodiments, the probes of the array consist of nucleotide sequences of 10 to 1,000 nucleotides. In one embodiment, the nucleotide sequences of the probes are in the range of 10-200 nucleotides in length and are genomic sequences of one species of organism, such that a plurality of different probes is present, with sequences complementary and thus capable of hybridizing to the genome of such a species of organism, sequentially tiled across all or a portion of the genome. In other embodiments, the probes are in the range of 10-30 nucleotides in length, in the range of 10-40 nucleotides in length, in the range of 20-50 nucleotides in length, in the range of 40-80 nucleotides in length, in the range of 50-150 nucleotides in length, in the range of 80-120 nucleotides in length, or are 60 nucleotides in length.
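Sequential tiling of probes across a genomic region can be sketched as follows. The function name `tile_probes`, the default 60-nucleotide probe length, and the `step` parameter (which controls overlap between neighboring probes) are assumptions made for this illustration. The sketch returns subsequences of the strand supplied; whether the physical probe carries that sequence or its complement depends on which strand of the target is to be detected.

```python
def tile_probes(sequence, probe_length=60, step=60):
    """Return (start, probe_sequence) pairs tiled across `sequence`.

    step == probe_length gives end-to-end tiling; a smaller step gives
    overlapping probes. Trailing bases that cannot fill a whole probe
    are not covered in this simple sketch.
    """
    if probe_length <= 0 or step <= 0:
        raise ValueError("probe_length and step must be positive")
    probes = []
    for start in range(0, len(sequence) - probe_length + 1, step):
        probes.append((start, sequence[start:start + probe_length]))
    return probes
```

Recording the start coordinate with each probe keeps the link between a probe and its genomic position, which is what a positionally addressable array relies on.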

The probes may comprise DNA or DNA “mimics” (e.g., derivatives and analogues) corresponding to a portion of an organism's genome. In another embodiment, the probes of the microarray are complementary RNA or RNA mimics. DNA mimics are polymers composed of subunits capable of specific, Watson-Crick-like hybridization with DNA, or of specific hybridization with RNA. The nucleic acids can be modified at the base moiety, at the sugar moiety, or at the phosphate backbone (e.g., phosphorothioates).

DNA can be obtained, e.g., by polymerase chain reaction (PCR) amplification of genomic DNA or cloned sequences. PCR primers are preferably chosen based on a known sequence of the genome that will result in amplification of specific fragments of genomic DNA. Computer programs that are well known in the art are useful in the design of primers with the required specificity and optimal amplification properties, such as Oligo version 5.0 (National Biosciences). Typically each probe on the microarray will be between 10 bases and 50,000 bases, usually between 300 bases and 1,000 bases in length. PCR methods are well known in the art, and are described, for example, in Innis et al., eds., PCR Protocols: A Guide To Methods And Applications, Academic Press Inc., San Diego, Calif. (1990); herein incorporated by reference in its entirety. It will be apparent to one skilled in the art that controlled robotic systems are useful for isolating and amplifying nucleic acids.

An alternative, preferred means for generating polynucleotide probes is by synthesis of synthetic polynucleotides or oligonucleotides, e.g., using H-phosphonate or phosphoramidite chemistries (Froehler et al., Nucleic Acids Res. 14:5399-5407 (1986); McBride et al., Tetrahedron Lett. 24:246-248 (1983)). Synthetic sequences are typically between about 10 and about 500 bases in length, more typically between about 20 and about 100 bases, and most preferably between about 40 and about 70 bases in length. In some embodiments, synthetic nucleic acids include non-natural bases, such as, but by no means limited to, inosine. As noted above, nucleic acid analogues may be used as binding sites for hybridization. An example of a suitable nucleic acid analogue is peptide nucleic acid (see, e.g., Egholm et al., Nature 363:566-568 (1993); U.S. Pat. No. 5,539,083).

Probes are preferably selected using an algorithm that takes into account binding energies, base composition, sequence complexity, cross-hybridization binding energies, and secondary structure. See Friend et al., International Patent Publication WO 01/05935, published Jan. 25, 2001; Hughes et al., Nat. Biotech. 19:342-7 (2001).

A skilled artisan will also appreciate that positive control probes, e.g., probes known to be complementary and hybridizable to sequences in the target polynucleotide molecules, and negative control probes, e.g., probes known to not be complementary and hybridizable to sequences in the target polynucleotide molecules, should be included on the array. In one embodiment, positive controls are synthesized along the perimeter of the array. In another embodiment, positive controls are synthesized in diagonal stripes across the array. In still another embodiment, the reverse complement for each probe is synthesized next to the position of the probe to serve as a negative control. In yet another embodiment, sequences from other species of organism are used as negative controls or as “spike-in” controls.
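The reverse-complement negative control mentioned above can be computed directly from each probe sequence. The following is a minimal sketch for DNA probes; the function names `reverse_complement` and `with_negative_controls` are assumptions made for this illustration, and only the unambiguous bases A, C, G, and T are handled.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(probe):
    """Return the reverse complement of a DNA probe sequence."""
    return probe.translate(_COMPLEMENT)[::-1]


def with_negative_controls(probes):
    """Pair each probe with its reverse complement as a negative control."""
    return [(probe, reverse_complement(probe)) for probe in probes]
```

A probe that equals its own reverse complement (for example, `GAATTC`) would not be distinguishable from its control, so such sequences may need to be flagged during probe design.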

The probes are attached to a solid support or surface, which may be made, e.g., from glass, plastic (e.g., polypropylene, nylon), polyacrylamide, nitrocellulose, gel, or other porous or nonporous material. One method for attaching nucleic acids to a surface is by printing on glass plates, as is described generally by Schena et al., Science 270:467-470 (1995). This method is especially useful for preparing microarrays of cDNA (see also DeRisi et al., Nature Genetics 14:457-460 (1996); Shalon et al., Genome Res. 6:639-645 (1996); and Schena et al., Proc. Natl. Acad. Sci. U.S.A. 93:10539-11286 (1995); herein incorporated by reference in their entireties).

A second method for making microarrays produces high-density oligonucleotide arrays. Techniques are known for producing arrays containing thousands of oligonucleotides complementary to defined sequences, at defined locations on a surface using photolithographic techniques for synthesis in situ (see, Fodor et al., 1991, Science 251:767-773; Pease et al., 1994, Proc. Natl. Acad. Sci. U.S.A. 91:5022-5026; Lockhart et al., 1996, Nature Biotechnology 14:1675; U.S. Pat. Nos. 5,578,832; 5,556,752; and 5,510,270; herein incorporated by reference in their entireties) or other methods for rapid synthesis and deposition of defined oligonucleotides (Blanchard et al., Biosensors & Bioelectronics 11:687-690; herein incorporated by reference in its entirety). When these methods are used, oligonucleotides (e.g., 60-mers) of known sequence are synthesized directly on a surface such as a derivatized glass slide. Usually, the array produced is redundant, with several oligonucleotide molecules per RNA.

Other methods for making microarrays, e.g., by masking (Maskos and Southern, 1992, Nuc. Acids. Res. 20:1679-1684; herein incorporated by reference in its entirety), may also be used. In principle, any type of array, for example, dot blots on a nylon hybridization membrane (see Sambrook, et al., Molecular Cloning: A Laboratory Manual, 3rd Edition, 2001) could be used. However, as will be recognized by those skilled in the art, very small arrays will frequently be preferred because hybridization volumes will be smaller.

Microarrays can also be manufactured by means of an ink jet printing device for oligonucleotide synthesis, e.g., using the methods and systems described by Blanchard in U.S. Pat. No. 6,028,189; Blanchard et al., 1996, Biosensors and Bioelectronics 11:687-690; Blanchard, 1998, in Synthetic DNA Arrays in Genetic Engineering, Vol. 20, J. K. Setlow, Ed., Plenum Press, New York at pages 111-123; herein incorporated by reference in their entireties. Specifically, the oligonucleotide probes in such microarrays are synthesized in arrays, e.g., on a glass slide, by serially depositing individual nucleotide bases in “microdroplets” of a high surface tension solvent such as propylene carbonate. The microdroplets have small volumes (e.g., 100 pL or less, more preferably 50 pL or less) and are separated from each other on the microarray (e.g., by hydrophobic domains) to form circular surface tension wells which define the locations of the array elements (i.e., the different probes). Microarrays manufactured by this ink jet method are typically of high density, preferably having a density of at least about 2,500 different probes per 1 cm². The polynucleotide probes are attached to the support covalently at either the 3′ or the 5′ end of the polynucleotide.

Biomarker polynucleotides which may be measured by microarray analysis can be expressed lncRNAs or a nucleic acid derived therefrom (e.g., cDNA or amplified RNA derived from cDNA that incorporates an RNA polymerase promoter), including naturally occurring nucleic acid molecules, as well as synthetic nucleic acid molecules. In one embodiment, the target polynucleotide molecules comprise RNA, including, but by no means limited to, total cellular RNA, lncRNA, poly(A)⁺ messenger RNA (mRNA) or a fraction thereof, cytoplasmic mRNA, or RNA transcribed from cDNA (i.e., cRNA; see, e.g., Linsley & Schelter, U.S. patent application Ser. No. 09/411,074, filed Oct. 4, 1999, or U.S. Pat. Nos. 5,545,522, 5,891,636, or 5,716,785). Methods for preparing total and poly(A)⁺ RNA are well known in the art, and are described generally, e.g., in Sambrook, et al., Molecular Cloning: A Laboratory Manual (3rd Edition, 2001). RNA can be extracted from a cell of interest using guanidinium thiocyanate lysis followed by CsCl centrifugation (Chirgwin et al., 1979, Biochemistry 18:5294-5299), a silica gel-based column (e.g., RNeasy (Qiagen, Valencia, Calif.) or StrataPrep (Stratagene, La Jolla, Calif.)), or using phenol and chloroform, as described in Ausubel et al., eds., 1989, Current Protocols In Molecular Biology, Vol. III, Greene Publishing Associates, Inc., John Wiley & Sons, Inc., New York, at pp. 13.12.1-13.12.5. Poly(A)⁺ RNA can be selected, e.g., by selection with oligo-dT cellulose or, alternatively, by oligo-dT primed reverse transcription of total cellular RNA. RNA can be fragmented by methods known in the art, e.g., by incubation with ZnCl₂, to generate fragments of RNA.

In one embodiment, total RNA, lncRNAs, or nucleic acids derived therefrom, are isolated from a sample taken from a patient undergoing cancer treatment, stem cell therapy, or regenerative medical treatment. Biomarker lncRNAs that are poorly expressed in particular cells may be enriched using normalization techniques (Bonaldo et al., 1996, Genome Res. 6:791-806).

As described above, the biomarker polynucleotides can be detectably labeled at one or more nucleotides. Any method known in the art may be used to label the target polynucleotides. Preferably, this labeling incorporates the label uniformly along the length of the RNA, and more preferably, the labeling is carried out at a high degree of efficiency. For example, polynucleotides can be labeled by oligo-dT primed reverse transcription. Random primers (e.g., 9-mers) can be used in reverse transcription to uniformly incorporate labeled nucleotides over the full length of the polynucleotides. Alternatively, random primers may be used in conjunction with PCR methods or T7 promoter-based in vitro transcription methods in order to amplify polynucleotides.

The detectable label may be a luminescent label. For example, fluorescent labels, bioluminescent labels, chemiluminescent labels, and colorimetric labels may be used in the practice of the invention. Fluorescent labels that can be used include, but are not limited to, fluorescein, a phosphor, a rhodamine, or a polymethine dye derivative. Additionally, commercially available fluorescent labels including, but not limited to, fluorescent phosphoramidites such as FluorePrime (Amersham Pharmacia, Piscataway, N.J.), Fluoredite (Millipore, Bedford, Mass.), FAM (ABI, Foster City, Calif.), and Cy3 or Cy5 (Amersham Pharmacia, Piscataway, N.J.) can be used. Alternatively, the detectable label can be a radiolabeled nucleotide.

In one embodiment, biomarker polynucleotide molecules from a patient sample are labeled differentially from the corresponding polynucleotide molecules of a reference sample. The reference can comprise lncRNAs from a normal biological sample (i.e., control sample, e.g., biopsy from a subject not having cancer, or untreated cells or tissue) or from a reference biological sample (e.g., sample from a subject having cancer, sample of cells or tissue at different stages of differentiation or treatment).

Nucleic acid hybridization and wash conditions are chosen so that the target polynucleotide molecules specifically bind or specifically hybridize to the complementary polynucleotide sequences of the array, preferably to a specific array site, wherein its complementary DNA is located. Arrays containing double-stranded probe DNA situated thereon are preferably subjected to denaturing conditions to render the DNA single-stranded prior to contacting with the target polynucleotide molecules. Arrays containing single-stranded probe DNA (e.g., synthetic oligodeoxyribonucleic acids) may need to be denatured prior to contacting with the target polynucleotide molecules, e.g., to remove hairpins or dimers which form due to self-complementary sequences.

Optimal hybridization conditions will depend on the length (e.g., oligomer versus polynucleotide greater than 200 bases) and type (e.g., RNA, or DNA) of probe and target nucleic acids. One of skill in the art will appreciate that as the oligonucleotides become shorter, it may become necessary to adjust their length to achieve a relatively uniform melting temperature for satisfactory hybridization results. General parameters for specific (i.e., stringent) hybridization conditions for nucleic acids are described in Sambrook, et al., Molecular Cloning: A Laboratory Manual (3rd Edition, 2001), and in Ausubel et al., Current Protocols In Molecular Biology, vol. 2, Current Protocols Publishing, New York (1994). Typical hybridization conditions for the cDNA microarrays of Schena et al. are hybridization in 5×SSC plus 0.2% SDS at 65° C. for four hours, followed by washes at 25° C. in low stringency wash buffer (1×SSC plus 0.2% SDS), followed by 10 minutes at 25° C. in higher stringency wash buffer (0.1×SSC plus 0.2% SDS) (Schena et al., Proc. Natl. Acad. Sci. U.S.A. 93:10614 (1996)). Useful hybridization conditions are also provided in, e.g., Tijssen, 1993, Hybridization with Nucleic Acid Probes, Elsevier Science Publishers B.V.; and Kricka, 1992, Nonisotopic DNA Probe Techniques, Academic Press, San Diego, Calif. Particularly preferred hybridization conditions include hybridization at a temperature at or near the mean melting temperature of the probes (e.g., within 5° C., more preferably within 2° C.) in 1 M NaCl, 50 mM MES buffer (pH 6.5), 0.5% sodium sarcosine and 30% formamide.
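The condition of hybridizing at or near the mean melting temperature of the probes can be checked with a short calculation. The sketch below is illustrative only: the function name `near_mean_tm` is an assumption, the per-probe melting temperatures are taken as inputs (how they are estimated is outside the sketch), and the tolerance is supplied by the caller rather than fixed.

```python
from statistics import mean


def near_mean_tm(probe_tms_c, hybridization_temp_c, tolerance_c):
    """Check a hybridization temperature against the probes' mean Tm.

    probe_tms_c: melting temperatures of the probes, in degrees Celsius.
    Returns (within_tolerance, mean_tm).
    """
    mean_tm = mean(probe_tms_c)
    within = abs(hybridization_temp_c - mean_tm) <= tolerance_c
    return within, mean_tm
```

A wide spread of individual melting temperatures around the mean would suggest adjusting probe lengths, as the paragraph above notes for shorter oligonucleotides.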

When fluorescently labeled gene products are used, the fluorescence emissions at each site of a microarray may be, preferably, detected by scanning confocal laser microscopy. In one embodiment, a separate scan, using the appropriate excitation line, is carried out for each of the two fluorophores used. Alternatively, a laser may be used that allows simultaneous specimen illumination at wavelengths specific to the two fluorophores and emissions from the two fluorophores can be analyzed simultaneously (see Shalon et al., 1996, “A DNA microarray system for analyzing complex DNA samples using two-color fluorescent probe hybridization,” Genome Research 6:639-645, which is incorporated by reference in its entirety for all purposes). Arrays can be scanned with a laser fluorescent scanner with a computer controlled X-Y stage and a microscope objective. Sequential excitation of the two fluorophores is achieved with a multi-line, mixed gas laser and the emitted light is split by wavelength and detected with two photomultiplier tubes. Fluorescence laser scanning devices are described in Shalon et al., Genome Res. 6:639-645 (1996), and in other references cited herein. Alternatively, the fiber-optic bundle described by Ferguson et al., Nature Biotech. 14:1681-1684 (1996), may be used to monitor mRNA abundance levels at a large number of sites simultaneously.
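Once the two fluorophore channels have been scanned, the patient and reference signals at each probe can be compared. The text does not prescribe a particular calculation; the sketch below uses a log2 ratio, a commonly used summary for two-color data, as an assumption. The function name `log2_ratios`, the dictionary inputs keyed by probe identifier, and the `floor` value used to avoid taking the logarithm of zero are all hypothetical choices for this illustration.

```python
import math


def log2_ratios(patient, reference, floor=1.0):
    """Return log2(patient / reference) intensity for each probe.

    patient, reference: dicts mapping probe id -> background-corrected
    fluorescence intensity. Intensities below `floor` are raised to `floor`.
    """
    ratios = {}
    for probe_id, p in patient.items():
        r = reference[probe_id]
        ratios[probe_id] = math.log2(max(p, floor) / max(r, floor))
    return ratios
```

A positive value indicates higher signal in the patient channel and a negative value indicates higher signal in the reference channel; normalization between channels is omitted here for brevity.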

In one embodiment, the invention includes a microarray comprising aplurality of probes that hybridize to one or more lncRNAs selected fromthe group consisting of int:CDK6:143, dst:CDKN2A:43877, upst:CCNF:−1721,upst:CCNI:−6398, upst:CCNI:−6621, upst:CCNI:−6883, upst:CDKN1A:−4845,upst:CDK5R1:−4044, upst:CDK5R1:−4410, upst:CCNL2:−1391,upst:CCNL2:−2253, upst:CCNL2:−767, int:CDKN2D:1417, upst:CCNL2:−5540,int:CDKN1A:1420, int:CCNT1:602, upst:CCNL2:−3110, upst:CDK5R1:−5717,upst:CCNL2:−982, upst:CCNE2:−682, int:CDK5R1:183, upst:CDK5R1:−482,upst:CDK8:−798, upst:CDK9:−646, upst:CDK6:−1860, int:CDK6:1276,upst:CDK6:−533, upst:CDKN2C:−8037, upst:CCNK:−899, upst:CNNM3:−248,upst:CDKN1C:−4619, int:CDKN2A:6667, int:ARF:4530, upst:CDKN2B:−15913,upst:CDK6:−1679, upst:CDKN1A:−1210, int:CDKN2B:1926, dst:CDKN2A:39498,upst:CCNL1:−1968, upst:CCNL1:−2234, upst:CCNL1:−2383, upst:CCNL1:−2767,upst:CDK5R2:−6418, upst:CDK4:−7794, upst:CDKN1A:−5830, int:CDKN2C:159,upst:CCNYL2:−36, upst:CCNC:−6760, upst:CDKN2B:−2817, upst:CNNM3:−970,upst:CDK5R2:−6045, upst:CDKN1C:−2196, int:CCND1:874, int:CCND2:1205,upst:CDKN1C:−446, int:CCNG2:390, upst:CDK3:−4148, upst:CCNA2:−250,int:CDKL5:64, upst:CCND2: 3165, int:CCNK:210, int:CDKN1A:885,upst:CDK5R2:−9197, int:CNNM3:1459, upst:CCND1:−1659, int:CCNL2:463,upst:CCNE1:−1190, upst:CDK5R2:−8037, upst:CDKL3:−867, int:CCNG1:381,upst:CCND2:−2874, upst:CDKN2B:−130736, int:CCNI:1042, upst:CCND2:−4757,int:CDK9:352, int:CCND2:1689, int:CDKL5:1682, upst:CDK5R2:−4541,upst:CDK5:−7855, upst:CDK9:−1536, upst:CCND2:−1291, upst:CCND1:−377,int:CCNL1:1097, upst:CDK5R2:−648, upst:CCNL2:−7336, upst:CCND1:−2768,upst:CDK2:−1390, upst:CCNYL3:−8181, dst:CDKN2A:8650, upst:CDK8:−265,upst:CDK4:−4462, upst:CDKN2A:−44, int:CDKN2A:5270, upst:CCNJL:−2749,upst:CNNM4:−1843, upst:CDK5R2:−7376, int:CCNO:1417, upst:CDKN1C:−5,upst:CDKN1C:−6280, upst:ARF:−840, upst:CCND2:−1830, upst:CDK5R1:−206,upst:CCNA1:−1163, int:CCNE2:647, upst:CDK9:−909, upst:CCNYL3:−293,upst:CDKN3:−271, int:CCNT2:640, 
upst:CCND1:−2574, upst:CCNT2:−319,upst:CDK5R1:−3023, upst:CDK9:−3159, upst:CDK9:−8667, upst:CCNE2:−4956,int:CCND3:2384, upst:CDKN1B:−1362, upst:CCNI:−7899, upst:CCNT2:−6751,int:CDK5:1993, upst:CDK9:−8509, upst:CCND1:−7190, upst:CDKN1C:−7144,upst:CDKN3:−4479, upst:CCNB3:−3258, upst:CCND3:−9303, upst:CDK8:−8337,int:CDKN2C:643, upst:CCNYL3:−1019, upst:CDK5:−2373, int:CNNM4:1658,upst:CCNE2:−8552, upst:CCNG1:−9141, upst:CCND2:−4886, upst:CCNK:−8357,upst:CDK5:−9105, upst:CDKN2B:−108997, int:CCNB2:547, upst:CDKN3:−2291,dst:CDKN2A:30203, upst:CDK2:−5210, upst:CCNL1:−3430, upst:CCNF:−3964,upst:CCNK:−4426, upst:CCNF:−3743, upst:CDK5:−3754, upst:CDKN2B:−35359,upst:CDKN2B:−87467, upst:CDK5R2:−4915, upst:CCNF:−2075, upst:CDK6:−8726,upst:CDKN2B:−90566, int:CDKN2A:4904, int:CDKN2A:4432, upst:ARF:−2148,upst:CDKN2B:−130339, upst:CNNM3:−9238, upst:CCNG1:−4532, int:ARF:15754,upst:CCNF:−1085, upst:CDKN2B:−23831, upst:CDKN1A:−9569, int:CCNI:1874,dst:CDKN2A:45866, int:CCNC:816, upst:CCNC:−5405, upst:CDK4:−1632,upst:CCNK:−3241, upst:CDK10:−1805, upst:CCNJL:−671, upst:CDKN3:−5723,upst:CDKN2B:−15114, upst:CCNE2:−5939, upst:CCNJL:−7299,upst:CCND3:−4248, int:CDK9:1811, upst:CDKN2C:−8538, upst:ARF:−1395,upst:CCND2:−6904, upst:CDK4:−977, upst:CCNE1:−5422, upst:CCNE2:−2828,upst:CDK4:−2133, upst:CDK8:−9630, upst:CDK3:−4497, upst:CCND3:−6423,upst:CCND1:−8918, upst:CDKN2B:−119804, upst:CDKN3:−5438,upst:CDKN2C:−7397, upst:CCNYL1:−3709, upst:CDKL4:−6205,upst:CDKN1A:−2237, upst:CDKN1C:−4093, upst:CCND2:−9042, int:CDK8:566,upst:CDKN2B:−804, upst:CCNE1:−4445, upst:CDKN2B:−74328,upst:CDKN2B:−53107, upst:CCNE1:−9426, upst:CDKN2C:−3161,upst:CCNG2:−2953, upst:CNNM1:−2645, upst:CDKN1A:−1902, upst:CDKN3:−1974,upst:CDK10:−4173, upst:CDK9:−9782, upst:CDKN1C:−5693, upst:CDK5:−9871,upst:CNNM4:−4755, upst:CDKN2B:−31120, upst:CDK2:−8040,upst:CDKN2B:−75214, upst:CDKN2C:−127, upst:CDKN1C:−1017, andupst:CNNM4:−3840.

In another embodiment, the invention includes a microarray comprising aplurality of probes that hybridize to int:CDK6:1276, upst:CDK6:−533,upst:CDKN2C:−8037, int:CDKN2D:1417, upst:CCNL2:−5540, int:CDKN1A:1420,int:CCNT1:602, upst:CCNL2:−3110, upst:CDK5R1:−5717, upst:CCNL2:−982,upst:CCNE2:−682, int:CDK5R1:183, upst:CDK5R1:−482, upst:CDK8:−798,upst:CDK9:−646, upst:CDK6:−1860, int:CDKN2B:1926, int:CDK6:143,dst:CDKN2A:43877, upst:CCNF:−1721, upst:CCNI:−6398, upst:CCNI:−6621,upst:CCNI:−6883, upst:CDKN1A:−4845, upst:CDK5R1:−4044,upst:CDK5R1:−4410, upst:CCNL2:−1391, upst:CCNL2:−2253, upst:CCNL2:−767,dst:CDKN2A:39498, upst:CCNL1:−1968, upst:CCNL1:−2234, upst:CCNL1:−2383,upst:CCNL1:−2767, upst:CDK5R2:−6418, upst:CDK4:−7794, upst:CDKN1A:−5830,int:CDKN2C:159, upst:CCNYL2:−36, upst:CCNC:−6760, upst:CDKN2B:−2817,upst:CNNM3:−970, upst:CDK5R2:−6045, and upst:CDKN1C:−2196.

In another embodiment, the invention includes a microarray comprising a plurality of probes that hybridize to upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017.

In another embodiment, the invention includes a microarray comprising a plurality of probes that hybridize to upst:CCNL1:−2767, int:CDKN1A:+885, upst:CDKN1A:−4845, upst:CDKN2B:−2817, upst:CDK9:−9782, int:ARF:+4517, int:ARF:+4530, upst:CDKN1C:−1017, int:CCNG1:+381, and upst:CCNG2:−2953.

Polynucleotides can also be analyzed by other methods including, but not limited to, northern blotting, nuclease protection assays (S1 nuclease or RNase protection assays), RNA fingerprinting, polymerase chain reaction, ligase chain reaction, Qbeta replicase, isothermal amplification methods, strand displacement amplification, transcription based amplification systems, and SAGE, as well as methods disclosed in International Publication Nos. WO 88/10315 and WO 89/06700, and International Applications Nos. PCT/US87/00880 and PCT/US89/01025; herein incorporated by reference in their entireties.

A standard Northern blot assay can be used to ascertain an RNA transcript size, identify alternatively spliced RNA transcripts, and determine the relative amounts of mRNA or lncRNA in a sample, in accordance with conventional Northern hybridization techniques known to those persons of ordinary skill in the art. In Northern blots, RNA samples are first separated by size by electrophoresis in an agarose gel under denaturing conditions. The RNA is then transferred to a membrane, cross-linked, and hybridized with a labeled probe. Nonisotopic or high specific activity radiolabeled probes can be used, including random-primed, nick-translated, or PCR-generated DNA probes, in vitro transcribed RNA probes, and oligonucleotides. Additionally, sequences with only partial homology (e.g., cDNA from a different species or genomic DNA fragments that might contain an exon) may be used as probes. The labeled probe, e.g., a radiolabelled cDNA, either containing the full-length, single stranded DNA or a fragment of that DNA sequence, may be at least 20, at least 30, at least 50, or at least 100 consecutive nucleotides in length. The probe can be labeled by any of the many different methods known to those skilled in this art. The labels most commonly employed for these studies are radioactive elements, enzymes, chemicals that fluoresce when exposed to ultraviolet light, and others. A number of fluorescent materials are known and can be utilized as labels. These include, but are not limited to, fluorescein, rhodamine, auramine, Texas Red, AMCA blue and Lucifer Yellow. A particular detecting material is anti-rabbit antibody prepared in goats and conjugated with fluorescein through an isothiocyanate. Proteins can also be labeled with a radioactive element or with an enzyme. The radioactive label can be detected by any of the currently available counting procedures. Isotopes that can be used include, but are not limited to, ³H, ¹⁴C, ³²P, ³⁵S, ³⁶Cl, ⁵¹Cr, ⁵⁷Co, ⁵⁸Co, ⁵⁹Fe, ⁹⁰Y, ¹²⁵I, ¹³¹I, and ¹⁸⁶Re. Enzyme labels are likewise useful, and can be detected by any of the presently utilized colorimetric, spectrophotometric, fluorospectrophotometric, amperometric or gasometric techniques. The enzyme is conjugated to the selected particle by reaction with bridging molecules such as carbodiimides, diisocyanates, glutaraldehyde and the like. Any enzymes known to one of skill in the art can be utilized. Examples of such enzymes include, but are not limited to, peroxidase, beta-D-galactosidase, urease, glucose oxidase plus peroxidase and alkaline phosphatase. U.S. Pat. Nos. 3,654,090, 3,850,752, and 4,016,043 are referred to by way of example for their disclosure of alternate labeling material and methods.

Nuclease protection assays (including both ribonuclease protection assays and S1 nuclease assays) can be used to detect and quantitate specific mRNAs and lncRNAs. In nuclease protection assays, an antisense probe (carrying, e.g., a radioactive or nonisotopic label) hybridizes in solution to an RNA sample. Following hybridization, single-stranded, unhybridized probe and RNA are degraded by nucleases. An acrylamide gel is used to separate the remaining protected fragments. Typically, solution hybridization is more efficient than membrane-based hybridization, and it can accommodate up to 100 μg of sample RNA, compared with the 20-30 μg maximum of blot hybridizations.

The ribonuclease protection assay, which is the most common type of nuclease protection assay, requires the use of RNA probes. Oligonucleotides and other single-stranded DNA probes can only be used in assays containing S1 nuclease. The single-stranded, antisense probe must typically be completely complementary to the target RNA to prevent cleavage of the probe:target hybrid by nuclease.
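The requirement that an antisense probe pair with the target RNA along its whole length can be illustrated in silico. The following is a minimal sketch, not part of the disclosed methods: the function names and the 20-nt target are hypothetical, and only Watson-Crick pairs are counted as matches.

```python
# Complement table for building an RNA probe against an RNA target.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna_probe(target_rna: str) -> str:
    """Return the antisense RNA (written 5'->3') complementary to target_rna."""
    target = target_rna.upper().replace("T", "U")
    return "".join(RNA_COMPLEMENT[base] for base in reversed(target))


def mismatch_positions(probe: str, target_rna: str) -> list:
    """List 0-based target positions where the probe fails to base-pair.

    Assumption: G-U wobble pairs are treated as mismatches.
    """
    target = target_rna.upper().replace("T", "U")
    probe = probe.upper().replace("T", "U")
    if len(probe) != len(target):
        raise ValueError("probe and target must be the same length")
    aligned_probe = probe[::-1]  # read 3'->5' alongside the 5'->3' target
    return [i for i, (p, t) in enumerate(zip(aligned_probe, target))
            if RNA_COMPLEMENT[t] != p]


# Hypothetical 20-nt target fragment (illustrative only, not a PANDA sequence).
target = "AUGGCUAGCUAGGCUAACGU"
probe = antisense_rna_probe(target)
unpaired = mismatch_positions(probe, target)  # empty list for a full match
```

Any position reported by `mismatch_positions` marks a single-stranded site in the hybrid that a nuclease could cleave.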

Serial Analysis of Gene Expression (SAGE) can also be used to determine RNA (e.g., lncRNA) abundances in a cell sample. See, e.g., Velculescu et al., 1995, Science 270:484-7; Carulli, et al., 1998, Journal of Cellular Biochemistry Supplements 30/31:286-96; herein incorporated by reference in their entireties. SAGE analysis does not require a special device for detection, and is one of the preferable analytical methods for simultaneously detecting the expression of a large number of transcription products. First, RNA is extracted from cells. Next, the RNA is converted into cDNA using a biotinylated oligo (dT) primer, and treated with a four-base recognizing restriction enzyme (Anchoring Enzyme: AE) resulting in AE-treated fragments containing a biotin group at their 3′ terminus. Next, the AE-treated fragments are incubated with streptavidin for binding. The bound cDNA is divided into two fractions, and each fraction is then linked to a different double-stranded oligonucleotide adapter (linker) A or B. These linkers are composed of: (1) a protruding single strand portion having a sequence complementary to the sequence of the protruding portion formed by the action of the anchoring enzyme, (2) a 5′ nucleotide recognizing sequence of the IIS-type restriction enzyme (cleaves at a predetermined location no more than 20 bp away from the recognition site) serving as a tagging enzyme (TE), and (3) an additional sequence of sufficient length for constructing a PCR-specific primer. The linker-linked cDNA is cleaved using the tagging enzyme, and only the linker-linked cDNA sequence portion remains, which is present in the form of a short-strand sequence tag. Next, pools of short-strand sequence tags from the two different types of linkers are linked to each other, followed by PCR amplification using primers specific to linkers A and B. As a result, the amplification product is obtained as a mixture comprising myriad sequences of two adjacent sequence tags (ditags) bound to linkers A and B. The amplification product is treated with the anchoring enzyme, and the free ditag portions are linked into strands in a standard linkage reaction. The amplification product is then cloned. Determination of the clone's nucleotide sequence can be used to obtain a read-out of consecutive ditags of constant length. The presence of mRNA corresponding to each tag can then be identified from the nucleotide sequence of the clone and information on the sequence tags.
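The tag-generation step described above can be mimicked computationally. The sketch below is a simplified illustration under stated assumptions: the anchoring-enzyme site CATG (the NlaIII site, a common choice) and a 10-nt tag length are assumed here and are not specified in this disclosure, and the function names are hypothetical. For each cDNA, the tag is taken immediately downstream of the 3′-most anchoring site, and identical tags are counted as an estimate of transcript abundance.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # assumed anchoring-enzyme recognition site
TAG_LENGTH = 10       # assumed length of the tag released by the tagging enzyme


def extract_tag(cdna, anchor=ANCHOR_SITE, tag_length=TAG_LENGTH):
    """Return the tag following the 3'-most anchoring site, or None."""
    seq = cdna.upper()
    pos = seq.rfind(anchor)  # 3'-most occurrence of the anchoring site
    if pos == -1:
        return None          # transcript lacks the site and yields no tag
    start = pos + len(anchor)
    tag = seq[start:start + tag_length]
    return tag if len(tag) == tag_length else None


def count_tags(cdnas):
    """Count tag occurrences across a collection of cDNA sequences."""
    tags = (extract_tag(c) for c in cdnas)
    return Counter(t for t in tags if t is not None)
```

In practice, each tag would then be matched against a reference set of transcript sequences to assign it to an mRNA or lncRNA.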

Quantitative reverse transcriptase PCR (qRT-PCR) can also be used to determine the expression profiles of biomarkers (see, e.g., U.S. Patent Application Publication No. 2005/0048542A1; herein incorporated by reference in its entirety). The first step in gene expression profiling by RT-PCR is the reverse transcription of the RNA template into cDNA, followed by its exponential amplification in a PCR reaction. The two most commonly used reverse transcriptases are avian myeloblastosis virus reverse transcriptase (AMV-RT) and Moloney murine leukemia virus reverse transcriptase (MMLV-RT). The reverse transcription step is typically primed using specific primers, random hexamers, or oligo-dT primers, depending on the circumstances and the goal of expression profiling. For example, extracted RNA can be reverse-transcribed using a GeneAmp RNA PCR kit (Perkin Elmer, Calif., USA), following the manufacturer's instructions. The derived cDNA can then be used as a template in the subsequent PCR reaction.

Although the PCR step can use a variety of thermostable DNA-dependent DNA polymerases, it typically employs the Taq DNA polymerase, which has a 5′-3′ nuclease activity but lacks a 3′-5′ proofreading exonuclease activity. Thus, TAQMAN PCR typically utilizes the 5′-nuclease activity of Taq or Tth polymerase to hydrolyze a hybridization probe bound to its target amplicon, but any enzyme with equivalent 5′ nuclease activity can be used. Two oligonucleotide primers are used to generate an amplicon typical of a PCR reaction. A third oligonucleotide, or probe, is designed to detect nucleotide sequence located between the two PCR primers. The probe is non-extendible by Taq DNA polymerase enzyme, and is labeled with a reporter fluorescent dye and a quencher fluorescent dye. Any laser-induced emission from the reporter dye is quenched by the quenching dye when the two dyes are located close together as they are on the probe. During the amplification reaction, the Taq DNA polymerase enzyme cleaves the probe in a template-dependent manner. The resultant probe fragments disassociate in solution, and signal from the released reporter dye is free from the quenching effect of the second fluorophore. One molecule of reporter dye is liberated for each new molecule synthesized, and detection of the unquenched reporter dye provides the basis for quantitative interpretation of the data.

TAQMAN RT-PCR can be performed using commercially available equipment, such as, for example, the ABI PRISM 7700 sequence detection system (Perkin-Elmer-Applied Biosystems, Foster City, Calif., USA), or Lightcycler (Roche Molecular Biochemicals, Mannheim, Germany). In a preferred embodiment, the 5′ nuclease procedure is run on a real-time quantitative PCR device such as the ABI PRISM 7700 sequence detection system. The system consists of a thermocycler, laser, charge-coupled device (CCD) camera, and computer. The system includes software for running the instrument and for analyzing the data. 5′-Nuclease assay data are initially expressed as Ct, or the threshold cycle. Fluorescence values are recorded during every cycle and represent the amount of product amplified to that point in the amplification reaction. The point when the fluorescent signal is first recorded as statistically significant is the threshold cycle (Ct).

To minimize errors and the effect of sample-to-sample variation, RT-PCR is usually performed using an internal standard. The ideal internal standard is expressed at a constant level among different tissues, and is unaffected by the experimental treatment. RNAs most frequently used to normalize patterns of gene expression are mRNAs for the housekeeping genes glyceraldehyde-3-phosphate-dehydrogenase (GAPDH) and beta-actin.
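One common way to combine threshold cycles with an internal standard is the comparative Ct (2^-ΔΔCt) calculation. This disclosure does not prescribe a specific formula, so the sketch below is an assumption chosen for illustration; it further assumes that target and reference amplify with roughly 100% efficiency (product doubles each cycle). The function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target RNA in a sample relative to a control.

    Each Ct is first normalized to the internal standard (e.g., GAPDH)
    measured in the same specimen, then the two normalized values are compared.
    """
    delta_sample = ct_target_sample - ct_ref_sample      # dCt in the test sample
    delta_control = ct_target_control - ct_ref_control   # dCt in the control
    delta_delta = delta_sample - delta_control           # ddCt
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target crosses threshold two cycles earlier in
# the treated sample while the internal standard is unchanged.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
```

A lower Ct means more starting template, which is why the exponent carries a negative sign.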

A more recent variation of the RT-PCR technique is real time quantitative PCR, which measures PCR product accumulation through a dual-labeled fluorogenic probe (i.e., TAQMAN probe). Real time PCR is compatible both with quantitative competitive PCR, where an internal competitor for each target sequence is used for normalization, and with quantitative comparative PCR using a normalization gene contained within the sample, or a housekeeping gene for RT-PCR. For further details see, e.g., Heid et al., Genome Research 6:986-994 (1996).

Mass spectrometry, and particularly SELDI mass spectrometry, is a particularly useful method for detection of the biomarkers of this invention. A laser desorption time-of-flight mass spectrometer can be used in embodiments of the invention. In laser desorption mass spectrometry, a substrate or a probe comprising biomarkers is introduced into an inlet system. The biomarkers are desorbed and ionized into the gas phase by a laser from the ionization source. The ions generated are collected by an ion optic assembly, and then in a time-of-flight mass analyzer, ions are accelerated through a short high voltage field and allowed to drift into a high vacuum chamber. At the far end of the high vacuum chamber, the accelerated ions strike a sensitive detector surface at different times. Since the time-of-flight is a function of the mass of the ions, the elapsed time between ion formation and ion detector impact can be used to identify the presence or absence of markers of specific mass to charge ratio.
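The dependence of flight time on mass-to-charge ratio can be made concrete with an idealized linear time-of-flight model. The sketch below assumes that an ion of charge ze gains kinetic energy zeV in the acceleration field and then drifts a field-free length L, so that t = L·sqrt(m / (2zeV)); the time spent in the acceleration region, initial velocity spread, and instrument calibration are ignored. The function names and instrument parameters are hypothetical.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per unified atomic mass unit


def flight_time(mz, voltage, drift_length):
    """Drift time in seconds for an ion with mass-to-charge ratio mz (Da per charge)."""
    return drift_length * math.sqrt(
        mz * DALTON / (2.0 * ELEMENTARY_CHARGE * voltage))


def mz_from_flight_time(t, voltage, drift_length):
    """Invert the idealized relation to recover m/z from a measured drift time."""
    return (2.0 * ELEMENTARY_CHARGE * voltage * t ** 2
            / (DALTON * drift_length ** 2))


# Hypothetical instrument: 20 kV acceleration, 1.0 m drift tube.
t_light = flight_time(1000.0, 20000.0, 1.0)
t_heavy = flight_time(4000.0, 20000.0, 1.0)
```

Because flight time scales with the square root of m/z, a fourfold heavier ion takes twice as long to reach the detector in this model.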

Matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) can also be used for detecting the biomarkers of this invention. MALDI-MS is a method of mass spectrometry that involves the use of an energy absorbing molecule, frequently called a matrix, for desorbing proteins intact from a probe surface. MALDI is described, for example, in U.S. Pat. No. 5,118,937 (Hillenkamp et al.) and U.S. Pat. No. 5,045,694 (Beavis and Chait). In MALDI-MS, the sample is typically mixed with a matrix material and placed on the surface of an inert probe. Exemplary energy absorbing molecules include cinnamic acid derivatives, sinapinic acid (“SPA”), cyano hydroxy cinnamic acid (“CHCA”) and dihydroxybenzoic acid. Other suitable energy absorbing molecules are known to those skilled in this art. The matrix dries, forming crystals that encapsulate the analyte molecules. Then the analyte molecules are detected by laser desorption/ionization mass spectrometry.

Surface-enhanced laser desorption/ionization mass spectrometry, or SELDI-MS, represents an improvement over MALDI for the fractionation and detection of biomolecules, such as lncRNAs, in complex mixtures. SELDI is a method of mass spectrometry in which biomolecules, such as lncRNAs, are captured on the surface of a biochip using capture reagents that are bound there. Typically, non-bound molecules are washed from the probe surface before interrogation. SELDI is described, for example, in: U.S. Pat. No. 5,719,060 (“Method and Apparatus for Desorption and Ionization of Analytes,” Hutchens and Yip, Feb. 17, 1998), U.S. Pat. No. 6,225,047 (“Use of Retentate Chromatography to Generate Difference Maps,” Hutchens and Yip, May 1, 2001) and Weinberger et al., “Time-of-flight mass spectrometry,” in Encyclopedia of Analytical Chemistry, R. A. Meyers, ed., pp 11915-11918, John Wiley & Sons, Chichester, 2000.

Biomarkers on the substrate surface can be desorbed and ionized using gas phase ion spectrometry. Any suitable gas phase ion spectrometer can be used as long as it allows biomarkers on the substrate to be resolved. Preferably, gas phase ion spectrometers allow quantitation of biomarkers. In one embodiment, a gas phase ion spectrometer is a mass spectrometer. In a typical mass spectrometer, a substrate or a probe comprising biomarkers on its surface is introduced into an inlet system of the mass spectrometer. The biomarkers are then desorbed by a desorption source such as a laser, fast atom bombardment, high energy plasma, electrospray ionization, thermospray ionization, liquid secondary ion MS, field desorption, etc. The generated desorbed, volatilized species consist of preformed ions or neutrals which are ionized as a direct consequence of the desorption event. Generated ions are collected by an ion optic assembly, and then a mass analyzer disperses and analyzes the passing ions. The ions exiting the mass analyzer are detected by a detector. The detector then translates information of the detected ions into mass-to-charge ratios. Detection of the presence of biomarkers or other substances will typically involve detection of signal intensity. This, in turn, can reflect the quantity and character of biomarkers bound to the substrate. Any of the components of a mass spectrometer (e.g., a desorption source, a mass analyzer, a detector, etc.) can be combined with other suitable components described herein or others known in the art in embodiments of the invention.

Biomarkers can also be detected with assays based on the use of antibodies that specifically recognize the lncRNA biomarkers or polynucleotide or oligonucleotide fragments of the biomarkers. Such assays include, but are not limited to, immunohistochemistry (IHC), enzyme-linked immunosorbent assay (ELISA), radioimmunoassays (RIA), “sandwich” immunoassays, fluorescent immunoassays, and immunoprecipitation assays, the procedures of which are well known in the art (see, e.g., Ausubel et al, eds, 1994, Current Protocols in Molecular Biology, Vol. 1, John Wiley & Sons, Inc., New York, which is incorporated by reference herein in its entirety).

Antibodies that specifically bind to a biomarker can be prepared using any suitable methods known in the art. See, e.g., Coligan, Current Protocols in Immunology (1991); Harlow & Lane, Antibodies: A Laboratory Manual (1988); Goding, Monoclonal Antibodies: Principles and Practice (2d ed. 1986); and Kohler & Milstein, Nature 256:495-497 (1975). A biomarker antigen can be used to immunize a mammal, such as a mouse, rat, rabbit, guinea pig, monkey, or human, to produce polyclonal antibodies. If desired, a biomarker antigen can be conjugated to a carrier protein, such as bovine serum albumin, thyroglobulin, and keyhole limpet hemocyanin. Depending on the host species, various adjuvants can be used to increase the immunological response. Such adjuvants include, but are not limited to, Freund's adjuvant, mineral gels (e.g., aluminum hydroxide), and surface active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, and dinitrophenol). Among adjuvants used in humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are especially useful.

Monoclonal antibodies which specifically bind to a biomarker antigen can be prepared using any technique which provides for the production of antibody molecules by continuous cell lines in culture. These techniques include, but are not limited to, the hybridoma technique, the human B cell hybridoma technique, and the EBV hybridoma technique (Kohler et al., Nature 256, 495-97, 1975; Kozbor et al., J. Immunol. Methods 81, 31-42, 1985; Cote et al., Proc. Natl. Acad. Sci. 80, 2026-30, 1983; Cole et al., Mol. Cell. Biol. 62, 109-20, 1984).

In addition, techniques developed for the production of “chimeric antibodies,” the splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate antigen specificity and biological activity, can be used (Morrison et al., Proc. Natl. Acad. Sci. 81, 6851-55, 1984; Neuberger et al., Nature 312, 604-08, 1984; Takeda et al., Nature 314, 452-54, 1985). Monoclonal and other antibodies also can be “humanized” to prevent a patient from mounting an immune response against the antibody when it is used therapeutically. Such antibodies may be sufficiently similar in sequence to human antibodies to be used directly in therapy or may require alteration of a few key residues. Sequence differences between rodent antibodies and human sequences can be minimized by replacing residues which differ from those in the human sequences by site directed mutagenesis of individual residues or by grafting of entire complementarity determining regions.

Alternatively, humanized antibodies can be produced using recombinant methods, as described below. Antibodies which specifically bind to a particular antigen can contain antigen binding sites which are either partially or fully humanized, as disclosed in U.S. Pat. No. 5,565,332. Human monoclonal antibodies can be prepared in vitro as described in Simmons et al., PLoS Medicine 4(5), 928-36, 2007.

Alternatively, techniques described for the production of single chain antibodies can be adapted using methods known in the art to produce single chain antibodies which specifically bind to a particular antigen. Antibodies with related specificity, but of distinct idiotypic composition, can be generated by chain shuffling from random combinatorial immunoglobulin libraries (Burton, Proc. Natl. Acad. Sci. 88, 11120-23, 1991).

Single-chain antibodies also can be constructed using a DNA amplification method, such as PCR, using hybridoma cDNA as a template (Thirion et al., Eur. J. Cancer Prev. 5, 507-11, 1996). Single-chain antibodies can be mono- or bispecific, and can be bivalent or tetravalent. Construction of tetravalent, bispecific single-chain antibodies is taught, for example, in Coloma & Morrison, Nat. Biotechnol. 15, 159-63, 1997. Construction of bivalent, bispecific single-chain antibodies is taught in Mallender & Voss, J. Biol. Chem. 269, 199-206, 1994.

A nucleotide sequence encoding a single-chain antibody can be constructed using manual or automated nucleotide synthesis, cloned into an expression construct using standard recombinant DNA methods, and introduced into a cell to express the coding sequence, as described below. Alternatively, single-chain antibodies can be produced directly using, for example, filamentous phage technology (Verhaar et al., Int. J. Cancer 61, 497-501, 1995; Nicholls et al., J. Immunol. Meth. 165, 81-91, 1993).

Antibodies which specifically bind to a biomarker antigen also can be produced by inducing in vivo production in the lymphocyte population or by screening immunoglobulin libraries or panels of highly specific binding reagents as disclosed in the literature (Orlandi et al., Proc. Natl. Acad. Sci. 86, 3833-3837, 1989; Winter et al., Nature 349, 293-299, 1991).

Chimeric antibodies can be constructed as disclosed in WO 93/03151. Binding proteins which are derived from immunoglobulins and which are multivalent and multispecific, such as the “diabodies” described in WO 94/13804, also can be prepared.

Antibodies can be purified by methods well known in the art. For example, antibodies can be affinity purified by passage over a column to which the relevant antigen is bound. The bound antibodies can then be eluted from the column using a buffer with a high salt concentration.

Antibodies may be used in diagnostic assays to detect the presence of, or to quantify, the biomarkers in a biological sample. Such a diagnostic assay may comprise at least two steps: (i) contacting a biological sample with the antibody, wherein the sample is a tissue (e.g., human, animal, etc.), cell (e.g., stem cell), biological fluid (e.g., blood, urine, sputum, semen, amniotic fluid, saliva, etc.), biological extract (e.g., tissue or cellular homogenate, etc.), or a chromatography column, etc.; and (ii) quantifying the antibody bound to the substrate. The method may additionally involve a preliminary step of attaching the antibody, either covalently, electrostatically, or reversibly, to a solid support, before subjecting the bound antibody to the sample, as defined above and elsewhere herein.

Various diagnostic assay techniques are known in the art, such as competitive binding assays, direct or indirect sandwich assays and immunoprecipitation assays conducted in either heterogeneous or homogeneous phases (Zola, Monoclonal Antibodies: A Manual of Techniques, CRC Press, Inc., (1987), pp 147-158). The antibodies used in the diagnostic assays can be labeled with a detectable moiety. The detectable moiety should be capable of producing, either directly or indirectly, a detectable signal. For example, the detectable moiety may be a radioisotope, such as ³H, ¹⁴C, ³²P, or ¹²⁵I, a fluorescent or chemiluminescent compound, such as fluorescein isothiocyanate, rhodamine, or luciferin, or an enzyme, such as alkaline phosphatase, beta-galactosidase, green fluorescent protein, or horseradish peroxidase. Any method known in the art for conjugating the antibody to the detectable moiety may be employed, including those methods described by Hunter et al., Nature, 144:945 (1962); David et al., Biochem., 13:1014 (1974); Pain et al., J. Immunol. Methods, 40:219 (1981); and Nygren, J. Histochem. and Cytochem., 30:407 (1982).

Immunoassays can be used to determine the presence or absence of a biomarker in a sample as well as the quantity of a biomarker in a sample. First, a test amount of a biomarker in a sample can be detected using the immunoassay methods described above. If a biomarker is present in the sample, it will form an antibody-biomarker complex with an antibody that specifically binds the biomarker under suitable incubation conditions, as described above. The amount of an antibody-biomarker complex can be determined by comparing to a standard. A standard can be, e.g., a known compound or another lncRNA known to be present in a sample. As noted above, the test amount of a biomarker need not be measured in absolute units, as long as the unit of measurement can be compared to a control.

Kits

In yet another aspect, the invention provides kits for use in diagnosing cancer or monitoring stem cell therapy or regenerative medical treatments, wherein the kits can be used to detect the lncRNA biomarkers of the present invention. For example, the kits can be used to detect any one or more of the biomarkers described herein, which are differentially expressed between samples from a patient who has cancer or is undergoing stem cell therapy or regenerative medical treatment and samples from normal subjects. The kit may include one or more agents for detection of lncRNA biomarkers, a container for holding a biological sample isolated from a human subject, and printed instructions for reacting agents with the biological sample or a portion of the biological sample to detect the presence or amount of at least one lncRNA biomarker in the biological sample. The agents may be packaged in separate containers. The kit may further comprise one or more control reference samples and reagents for performing an immunoassay, a Northern blot, PCR, microarray analysis, or SAGE.

In certain embodiments, the kit contains at least one probe that selectively hybridizes to a biomarker, or at least one antibody that selectively binds to a biomarker, or at least one set of PCR primers for amplifying a biomarker. In one embodiment, the kit comprises at least one agent for measuring the level of PANDA.

The kit can comprise one or more containers for compositions contained in the kit. Compositions can be in liquid form or can be lyophilized. Suitable containers for the compositions include, for example, bottles, vials, syringes, and test tubes. Containers can be formed from a variety of materials, including glass or plastic. The kit can also comprise a package insert containing written instructions for methods of diagnosing cancer or monitoring stem cell therapy or regenerative medical treatments.

The kits of the invention have a number of applications. For example, the kits can be used for monitoring cell proliferation and differentiation during cancer progression, tissue regeneration, or growth of human cells, tissues, or organs in culture for tissue or organ replacement. In another example, the kits can be used for evaluating the efficacy of a treatment for cancer, stem cell therapy, or regenerative medicine. In a further example, the kits can be used to identify compounds that modulate expression of one or more of the biomarkers in in vitro or in vivo animal models to determine the effects of treatment.

C. PANDA and Inhibitors

In another aspect, an inhibitor of PANDA is used in the practice of the invention. Inhibitors of PANDA can include, but are not limited to, antisense oligonucleotides, inhibitory RNA molecules, such as miRNAs, siRNAs, piRNAs, and snRNAs, ribozymes, and small molecule inhibitors. Various types of inhibitors for inhibiting nucleic acid function are well known in the art. See, e.g., International patent application WO/2012/018881; U.S. patent application 2011/0251261; U.S. Pat. No. 6,713,457; Kole et al. (2012) Nat. Rev. Drug Discov. 11(2):125-40; Sanghvi (2011) Curr. Protoc. Nucleic Acid Chem. Chapter 4:Unit 4.1.1-22; herein incorporated by reference in their entireties.

Inhibitors can be single stranded or double stranded polynucleotides and may contain one or more chemical modifications, such as, but not limited to, locked nucleic acids, peptide nucleic acids, sugar modifications, such as 2′-O-alkyl (e.g., 2′-O-methyl, 2′-O-methoxyethyl), 2′-fluoro, and 4′-thio modifications, and backbone modifications, such as one or more phosphorothioate, morpholino, or phosphonocarboxylate linkages. In addition, inhibitory RNA molecules may have a “tail” covalently attached to their 3′- and/or 5′-end, which may be used to stabilize the RNA inhibitory molecule or enhance cellular uptake. Such tails include, but are not limited to, intercalating groups, various kinds of reporter groups, and lipophilic groups attached to the 3′ or 5′ ends of the RNA molecules. In certain embodiments, the RNA inhibitory molecule is conjugated to cholesterol or acridine. See, for example, the following for descriptions of syntheses of 3′-cholesterol or 3′-acridine modified oligonucleotides: Gamper, H. B., Reed, M. W., Cox, T., Virosco, J. S., Adams, A. D., Gall, A., Scholler, J. K., and Meyer, R. B. (1993) Facile Preparation and Exonuclease Stability of 3′-Modified Oligodeoxynucleotides. Nucleic Acids Res. 21 145-150; and Reed, M. W., Adams, A. D., Nelson, J. S., and Meyer, R. B., Jr. (1991) Acridine and Cholesterol-Derivatized Solid Supports for Improved Synthesis of 3′-Modified Oligonucleotides. Bioconjugate Chem. 2 217-225; herein incorporated by reference in their entireties. Additional lipophilic moieties that can be used include, but are not limited to, oleyl, retinyl, and cholesteryl residues, cholic acid, adamantane acetic acid, 1-pyrene butyric acid, dihydrotestosterone, 1,3-Bis-O(hexadecyl)glycerol, geranyloxyhexyl group, hexadecylglycerol, borneol, menthol, 1,3-propanediol, heptadecyl group, palmitic acid, myristic acid, O₃-(oleoyl)lithocholic acid, O₃-(oleoyl)cholenic acid, dimethoxytrityl, or phenoxazine. Additional compounds, and methods of use, are set out in US Patent Publication Nos. 2010/0076056, 2009/0247608 and 2009/0131360; herein incorporated by reference in their entireties.

In one embodiment, inhibition of PANDA function may be achieved by administering antisense oligonucleotides targeting PANDA. The antisense oligonucleotides may be ribonucleotides or deoxyribonucleotides. Preferably, the antisense oligonucleotides have at least one chemical modification. Antisense oligonucleotides may be comprised of one or more “locked nucleic acids”. “Locked nucleic acids” (LNAs) are modified ribonucleotides that contain an extra bridge between the 2′ and 4′ carbons of the ribose sugar moiety resulting in a “locked” conformation that confers enhanced thermal stability to oligonucleotides containing the LNAs. Alternatively, the antisense oligonucleotides may comprise peptide nucleic acids (PNAs), which contain a peptide-based backbone rather than a sugar-phosphate backbone. The antisense oligonucleotides may contain one or more chemical modifications, including, but not limited to, sugar modifications, such as 2′-O-alkyl (e.g., 2′-O-methyl, 2′-O-methoxyethyl), 2′-fluoro, and 4′-thio modifications, and backbone modifications, such as one or more phosphorothioate, morpholino, or phosphonocarboxylate linkages (see, for example, U.S. Pat. Nos. 6,693,187 and 7,067,641, which are herein incorporated by reference in their entireties). In some embodiments, suitable antisense oligonucleotides are 2′-O-methoxyethyl “gapmers” which contain 2′-O-methoxyethyl-modified ribonucleotides on both 5′ and 3′ ends with at least ten deoxyribonucleotides in the center. These “gapmers” are capable of triggering RNase H-dependent degradation mechanisms of RNA targets. Other modifications of antisense oligonucleotides to enhance stability and improve efficacy, such as those described in U.S. Pat. No. 6,838,283, which is herein incorporated by reference in its entirety, are known in the art and are suitable for use in the methods of the invention. Antisense oligonucleotides may comprise a sequence that is at least partially complementary to a PANDA target sequence, e.g., at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% complementary to the PANDA target sequence. In some embodiments, the antisense oligonucleotide may be substantially complementary to the PANDA target sequence, that is, at least about 95%, 96%, 97%, 98%, or 99% complementary to a target polynucleotide sequence. In one embodiment, the antisense oligonucleotide comprises a sequence that is 100% complementary to the PANDA target sequence.
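By way of illustration only, the 2′-O-methoxyethyl gapmer layout described above (modified wings on both ends flanking at least ten deoxyribonucleotides) can be checked with a short sketch such as the following. The one-letter chemistry encoding ("M" for a 2′-O-methoxyethyl ribonucleotide, "D" for a deoxyribonucleotide) and the function name `is_moe_gapmer` are assumptions introduced for this example and are not part of the disclosure.

```python
import re

# Hypothetical encoding (an assumption for this sketch):
#   "M" = 2'-O-methoxyethyl-modified ribonucleotide (wing position)
#   "D" = deoxyribonucleotide (central gap position)
_GAPMER_LAYOUT = re.compile(r"M+(D+)M+")


def is_moe_gapmer(chemistry: str, min_gap: int = 10) -> bool:
    """Return True if the per-position chemistry string has modified wings
    on both the 5' and 3' ends and a central run of at least `min_gap`
    deoxyribonucleotides."""
    match = _GAPMER_LAYOUT.fullmatch(chemistry)
    return match is not None and len(match.group(1)) >= min_gap


if __name__ == "__main__":
    print(is_moe_gapmer("MMMMM" + "D" * 10 + "MMMMM"))  # 5-10-5 layout
    print(is_moe_gapmer("MMMMM" + "D" * 8 + "MMMMM"))   # central gap too short
```

The check covers only the arrangement of modified and unmodified positions; it says nothing about sequence, target accessibility, or RNase H activity in practice.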

In another embodiment, the inhibitor of PANDA is an inhibitory RNA molecule (e.g., a miRNA, a siRNA, a piRNA, or a snRNA) having a single-stranded or double-stranded region that is at least partially complementary to the target sequence of PANDA, e.g., about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% complementary to the target sequence of PANDA. In some embodiments, the inhibitory RNA comprises a sequence that is substantially complementary to the target sequence of PANDA, e.g., about 95%, 96%, 97%, 98%, or 99% complementary to a target polynucleotide sequence. In other embodiments, the inhibitory RNA molecule may contain a region that has 100% complementarity to the target sequence. The inhibitory molecules may target the PANDA sequence of SEQ ID NO:1. In certain embodiments, the inhibitory RNA molecule may be a double-stranded, small interfering RNA or a short hairpin RNA molecule (shRNA) comprising a stem-loop structure. In one embodiment, the PANDA inhibitor is an siRNA comprising a nucleotide sequence selected from the group consisting of SEQ ID NOS:12-14.
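The percent-complementarity figures recited above for antisense oligonucleotides and inhibitory RNAs can be illustrated with a minimal sketch. It assumes an ungapped, equal-length comparison that counts only Watson-Crick pairs, with both sequences written 5′ to 3′; the function names and the 20-nucleotide example site are hypothetical and are not taken from SEQ ID NO:1 or SEQ ID NOS:12-14.

```python
# Watson-Crick partners for RNA; DNA input is normalised by mapping T to U.
_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def _as_rna(seq: str) -> str:
    return seq.upper().replace("T", "U")


def reverse_complement_rna(seq: str) -> str:
    """Return the perfectly complementary strand, written 5' to 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(_as_rna(seq)))


def percent_complementarity(inhibitor: str, target_site: str) -> float:
    """Percent of inhibitor positions that pair with the target site.

    Assumes an ungapped alignment over a site of the same length as the
    inhibitor; both sequences are given 5' to 3'.
    """
    inhibitor = _as_rna(inhibitor)
    if len(inhibitor) != len(target_site):
        raise ValueError("inhibitor and target site must be the same length")
    perfect = reverse_complement_rna(target_site)
    paired = sum(a == b for a, b in zip(inhibitor, perfect))
    return 100.0 * paired / len(inhibitor)


if __name__ == "__main__":
    site = "AUGGCUAGCUAGGCUAAGCU"          # hypothetical 20-nt target site
    perfect = reverse_complement_rna(site)
    mismatch = ("C" if perfect[0] != "C" else "G") + perfect[1:]
    print(percent_complementarity(perfect, site))
    print(percent_complementarity(mismatch, site))
```

Under these assumptions a single mismatch in a 20-nucleotide inhibitor still falls within the "substantially complementary" range of at least about 95%.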

An “effective amount” of a PANDA inhibitor (e.g., microRNA, siRNA, piRNA, snRNA, antisense oligonucleotide, ribozyme, or small molecule inhibitor) is an amount sufficient to effect beneficial or desired results, such as an amount that reduces PANDA activity, for example, by interfering with transcription of PANDA or interfering with binding of PANDA to the transcription factor NF-YA. In some embodiments, a PANDA inhibitor reduces the amount and/or activity of PANDA by at least about 10% to about 100%, 20% to about 100%, 30% to about 100%, 40% to about 100%, 50% to about 100%, 60% to about 100%, 70% to about 100%, 10% to about 90%, 20% to about 85%, 40% to about 84%, 60% to about 90%, including any percent within these ranges, such as but not limited to 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, and 99%.

In certain embodiments, the invention includes a method of modulating the activity of the transcription factor NF-YA in a cell, the method comprising introducing into the cell PANDA or an inhibitor of PANDA. In one embodiment, the activity of NF-YA is increased in the cell following administration of an inhibitor of PANDA. In another embodiment, the activity of NF-YA is decreased in the cell following administration of PANDA.

In certain embodiments, the invention includes a method of modulating the expression of one or more apoptotic genes in a cell, the method comprising introducing into the cell PANDA or an inhibitor of PANDA. In one embodiment, the expression of one or more apoptotic genes is increased in the cell following administration of an inhibitor of PANDA. In another embodiment, the expression of one or more apoptotic genes is decreased in the cell following administration of PANDA.

Inhibitors can be detectably labeled by well-known techniques. Detectable labels include, for example, radioactive isotopes, fluorescent labels, chemiluminescent labels, bioluminescent labels and enzyme labels. Such labeled inhibitors can be used to determine cellular uptake efficiency, quantitate binding of inhibitors at target sites, or visualize inhibitor localization.

In certain embodiments, PANDA or a PANDA inhibitor is expressed in vivo from a vector. A “vector” is a composition of matter which can be used to deliver a nucleic acid of interest to the interior of a cell. Numerous vectors are known in the art including, but not limited to, linear polynucleotides, polynucleotides associated with ionic or amphiphilic compounds, plasmids, and viruses. Thus, the term “vector” includes an autonomously replicating plasmid or a virus. Examples of viral vectors include, but are not limited to, adenoviral vectors, adeno-associated virus vectors, retroviral vectors, lentiviral vectors, and the like. An expression construct can be replicated in a living cell, or it can be made synthetically. For purposes of this application, the terms “expression construct,” “expression vector,” and “vector,” are used interchangeably to demonstrate the application of the invention in a general, illustrative sense, and are not intended to limit the invention.

In one embodiment, an expression vector for expressing PANDA or a PANDA inhibitor comprises a promoter “operably linked” to a polynucleotide encoding PANDA or a PANDA inhibitor. The phrase “operably linked” or “under transcriptional control” as used herein means that the promoter is in the correct location and orientation in relation to a polynucleotide to control the initiation of transcription by RNA polymerase and expression of the polynucleotide.

In certain embodiments, the nucleic acid encoding a polynucleotide of interest is under transcriptional control of a promoter. A “promoter” refers to a DNA sequence recognized by the synthetic machinery of the cell, or introduced synthetic machinery, required to initiate the specific transcription of a gene. The term promoter will be used here to refer to a group of transcriptional control modules that are clustered around the initiation site for RNA polymerase I, II, or III. Typical promoters for mammalian cell expression include the SV40 early promoter, a CMV promoter such as the CMV immediate early promoter (see, U.S. Pat. Nos. 5,168,062 and 5,385,839, incorporated herein by reference in their entireties), the mouse mammary tumor virus LTR promoter, the adenovirus major late promoter (Ad MLP), and the herpes simplex virus promoter, among others. Other nonviral promoters, such as a promoter derived from the murine metallothionein gene, will also find use for mammalian expression. These and other promoters can be obtained from commercially available plasmids, using techniques well known in the art. See, e.g., Sambrook et al., supra. Enhancer elements may be used in association with the promoter to increase expression levels of the constructs. Examples include the SV40 early gene enhancer, as described in Dijkema et al., EMBO J. (1985) 4:761, the enhancer/promoter derived from the long terminal repeat (LTR) of the Rous Sarcoma Virus, as described in Gorman et al., Proc. Natl. Acad. Sci. USA (1982b) 79:6777 and elements derived from human CMV, as described in Boshart et al., Cell (1985) 41:521, such as elements included in the CMV intron A sequence.

Typically, transcription terminator/polyadenylation signals will also be present in the expression construct. Examples of such sequences include, but are not limited to, those derived from SV40, as described in Sambrook et al., supra, as well as a bovine growth hormone terminator sequence (see, e.g., U.S. Pat. No. 5,122,458). Additionally, 5′-UTR sequences can be placed adjacent to the coding sequence in order to enhance expression of the same. Such sequences include UTRs which include an Internal Ribosome Entry Site (IRES) present in the leader sequences of picornaviruses such as the encephalomyocarditis virus (EMCV) UTR (Jang et al. J. Virol. (1989) 63:1651-1660). Other picornavirus UTR sequences that will also find use in the present invention include the polio leader sequence and hepatitis A virus leader and the hepatitis C IRES.

In certain embodiments of the invention, the cells containing nucleic acid constructs of the present invention may be identified in vitro or in vivo by including a marker in the expression construct. Such markers would confer an identifiable change to the cell permitting easy identification of cells containing the expression construct. Usually the inclusion of a drug selection marker aids in cloning and in the selection of transformants; for example, genes that confer resistance to neomycin, puromycin, hygromycin, DHFR, GPT, zeocin and histidinol are useful selectable markers. Alternatively, enzymes such as herpes simplex virus thymidine kinase (tk) or chloramphenicol acetyltransferase (CAT) may be employed. Fluorescent markers (e.g., green fluorescent protein (GFP), EGFP, or Dronpa), or immunologic markers can also be employed. The selectable marker employed is not believed to be important, so long as it is capable of being expressed simultaneously with the nucleic acid encoding a gene product. Further examples of selectable markers are well known to one of skill in the art.

There are a number of ways in which expression vectors may be introduced into cells. In certain embodiments of the invention, the expression construct comprises a virus or engineered construct derived from a viral genome. The ability of certain viruses to enter cells via receptor-mediated endocytosis, to integrate into the host cell genome, and to express viral genes stably and efficiently has made them attractive candidates for the transfer of foreign genes into mammalian cells (Ridgeway, 1988; Nicolas and Rubenstein, 1988; Baichwal and Sugden, 1986; Temin, 1986).

One of the preferred methods for in vivo delivery involves the use of an adenovirus expression vector. “Adenovirus expression vector” is meant to include those constructs containing adenovirus sequences sufficient to (a) support packaging of the construct and (b) express a polynucleotide that has been cloned therein. The expression vector comprises a genetically engineered form of adenovirus. Knowledge of the genetic organization of adenovirus, a 36 kb, linear, double-stranded DNA virus, allows substitution of large pieces of adenoviral DNA with foreign sequences up to 7 kb (Grunhaus and Horwitz, 1992). In contrast to retrovirus, the adenoviral infection of host cells does not result in chromosomal integration because adenoviral DNA can replicate in an episomal manner without potential genotoxicity. Also, adenoviruses are structurally stable, and no genome rearrangement has been detected after extensive amplification. Adenovirus can infect virtually all epithelial cells regardless of their cell cycle stage.

Adenovirus is particularly suitable for use as a gene transfer vector because of its mid-sized genome, ease of manipulation, high titer, wide target cell range and high infectivity. Both ends of the viral genome contain 100-200 base pair inverted repeats (ITRs), which are cis elements necessary for viral DNA replication and packaging.

Other than the requirement that the adenovirus vector be replication defective, or at least conditionally defective, the nature of the adenovirus vector is not believed to be crucial to the successful practice of the invention. The adenovirus may be of any of the 42 different known serotypes or subgroups A-F. Adenovirus type 5 of subgroup C is the preferred starting material in order to obtain the conditional replication-defective adenovirus vector for use in the present invention. This is because Adenovirus type 5 is a human adenovirus about which a great deal of biochemical and genetic information is known, and it has historically been used for most constructions employing adenovirus as a vector.

The typical vector according to the present invention is replication defective and will not have an adenovirus E1 region. Thus, it will be most convenient to introduce the polynucleotide encoding the gene of interest at the position from which the E1-coding sequences have been removed. However, the position of insertion of the construct within the adenovirus sequences is not critical to the invention. The polynucleotide encoding the gene of interest may also be inserted in lieu of the deleted E3 region in E3 replacement vectors, as described by Karlsson et al. (1986), or in the E4 region where a helper cell line or helper virus complements the E4 defect.

Adenovirus vectors have been used in eukaryotic gene expression (Levrero et al., 1991; Gomez-Foix et al., 1992) and vaccine development (Grunhaus and Horwitz, 1992; Graham and Prevec, 1991). Recently, animal studies suggested that recombinant adenovirus could be used for gene therapy (Stratford-Perricaudet and Perricaudet, 1991; Stratford-Perricaudet et al., 1990; Rich et al., 1993). Studies in administering recombinant adenovirus to different tissues include trachea instillation (Rosenfeld et al., 1991; Rosenfeld et al., 1992), muscle injection (Ragot et al., 1993), peripheral intravenous injections (Herz and Gerard, 1993) and stereotactic inoculation into the brain (Le Gal La Salle et al., 1993).

Retroviral vectors are also suitable for expressing PANDA or PANDA inhibitors in cells. The retroviruses are a group of single-stranded RNA viruses characterized by an ability to convert their RNA to double-stranded DNA in infected cells by a process of reverse-transcription (Coffin, 1990). The resulting DNA then stably integrates into cellular chromosomes as a provirus and directs synthesis of viral proteins. The integration results in the retention of the viral gene sequences in the recipient cell and its descendants. The retroviral genome contains three genes, gag, pol, and env that code for capsid proteins, polymerase enzyme, and envelope components, respectively. A sequence found upstream from the gag gene contains a signal for packaging of the genome into virions. Two long terminal repeat (LTR) sequences are present at the 5′ and 3′ ends of the viral genome. These contain strong promoter and enhancer sequences and are also required for integration in the host cell genome (Coffin, 1990).

In order to construct a retroviral vector, a nucleic acid encoding a gene of interest is inserted into the viral genome in the place of certain viral sequences to produce a virus that is replication-defective. In order to produce virions, a packaging cell line containing the gag, pol, and env genes but without the LTR and packaging components is constructed (Mann et al., 1983). When a recombinant plasmid containing a cDNA, together with the retroviral LTR and packaging sequences, is introduced into this cell line (by calcium phosphate precipitation for example), the packaging sequence allows the RNA transcript of the recombinant plasmid to be packaged into viral particles, which are then secreted into the culture media (Nicolas and Rubenstein, 1988; Temin, 1986; Mann et al., 1983). The media containing the recombinant retroviruses is then collected, optionally concentrated, and used for gene transfer. Retroviral vectors are able to infect a broad variety of cell types. However, integration and stable expression require the division of host cells (Paskind et al., 1975).

Other viral vectors may be employed as expression constructs in the present invention. Vectors derived from viruses such as vaccinia virus (Ridgeway, 1988; Baichwal and Sugden, 1986; Coupar et al., 1988), adeno-associated virus (AAV) (Ridgeway, 1988; Baichwal and Sugden, 1986; Hermonat and Muzyczka, 1984) and herpesviruses may be employed. They offer several attractive features for various mammalian cells (Friedmann, 1989; Ridgeway, 1988; Baichwal and Sugden, 1986; Coupar et al., 1988; Horwich et al., 1990).

In order to effect expression of sense or antisense gene constructs, the expression construct must be delivered into a cell. This delivery may be accomplished in vitro, as in laboratory procedures for transforming cell lines, or in vivo or ex vivo, as in the treatment of certain disease states. One mechanism for delivery is via viral infection where the expression construct is encapsidated in an infectious viral particle.

Several non-viral methods for the transfer of expression constructs into cultured mammalian cells also are contemplated by the present invention. These include calcium phosphate precipitation (Graham and Van Der Eb, 1973; Chen and Okayama, 1987; Rippe et al., 1990), DEAE-dextran (Gopal, 1985), electroporation (Tur-Kaspa et al., 1986; Porter et al., 1984), direct microinjection (Harland and Weintraub, 1985), DNA-loaded liposomes (Nicolau and Sene, 1982; Fraley et al., 1979) and lipofectamine-DNA complexes, cell sonication (Fechheimer et al., 1987), gene bombardment using high velocity microprojectiles (Yang et al., 1990), and receptor-mediated transfection (Wu and Wu, 1987; Wu and Wu, 1988). Some of these techniques may be successfully adapted for in vivo or ex vivo use.

Once the expression construct has been delivered into the cell, the nucleic acid encoding PANDA or the PANDA inhibitor of interest may be positioned and expressed at different sites. In certain embodiments, the nucleic acid encoding PANDA or a PANDA inhibitor may be stably integrated into the genome of the cell. This integration may be in the cognate location and orientation via homologous recombination (gene replacement) or it may be integrated in a random, non-specific location (gene augmentation). In yet further embodiments, the nucleic acid may be stably maintained in the cell as a separate, episomal segment of DNA. Such nucleic acid segments or “episomes” encode sequences sufficient to permit maintenance and replication independent of or in synchronization with the host cell cycle. How the expression construct is delivered to a cell and where in the cell the nucleic acid remains is dependent on the type of expression construct employed.

In yet another embodiment of the invention, the expression construct may simply consist of naked recombinant DNA or plasmids. Transfer of the construct may be performed by any of the methods mentioned above which physically or chemically permeabilize the cell membrane. This is particularly applicable for transfer in vitro but it may be applied to in vivo use as well. Dubensky et al. (1984) successfully injected polyomavirus DNA in the form of calcium phosphate precipitates into liver and spleen of adult and newborn mice demonstrating active viral replication and acute infection. Benvenisty and Reshef (1986) also demonstrated that direct intraperitoneal injection of calcium phosphate-precipitated plasmids results in expression of the transfected genes. It is envisioned that DNA encoding a gene of interest may also be transferred in a similar manner in vivo and express the gene product.

Still another embodiment of the invention for transferring a naked DNA expression construct into cells may involve particle bombardment. This method depends on the ability to accelerate DNA-coated microprojectiles to a high velocity allowing them to pierce cell membranes and enter cells without killing them (Klein et al., 1987). Several devices for accelerating small particles have been developed. One such device relies on a high voltage discharge to generate an electrical current, which in turn provides the motive force (Yang et al., 1990). The microprojectiles used have consisted of biologically inert substances such as tungsten or gold beads.

In a further embodiment of the invention, the expression construct may be entrapped in a liposome. Liposomes are vesicular structures characterized by a phospholipid bilayer membrane and an inner aqueous medium. Multilamellar liposomes have multiple lipid layers separated by aqueous medium. They form spontaneously when phospholipids are suspended in an excess of aqueous solution. The lipid components undergo self-rearrangement before the formation of closed structures and entrap water and dissolved solutes between the lipid bilayers (Ghosh and Bachhawat, 1991). Also contemplated are lipofectamine-DNA complexes.

In certain embodiments of the invention, the liposome may be complexed with a hemagglutinating virus (HVJ). This has been shown to facilitate fusion with the cell membrane and promote cell entry of liposome-encapsulated DNA (Kaneda et al., 1989). In other embodiments, the liposome may be complexed or employed in conjunction with nuclear non-histone chromosomal proteins (HMG-I) (Kato et al., 1991). In yet further embodiments, the liposome may be complexed or employed in conjunction with both HVJ and HMG-I. Because such expression constructs have been successfully employed in transfer and expression of nucleic acid in vitro and in vivo, they are applicable to the present invention. Where a bacterial promoter is employed in the DNA construct, it also will be desirable to include within the liposome an appropriate bacterial polymerase.

Other expression constructs which can be employed to deliver a nucleic acid encoding a particular lncRNA or inhibitor into cells are receptor-mediated delivery vehicles. These take advantage of the selective uptake of macromolecules by receptor-mediated endocytosis in almost all eukaryotic cells. Because of the cell type-specific distribution of various receptors, the delivery can be highly specific (Wu and Wu, 1993).

Receptor-mediated gene targeting vehicles generally consist of two components: a cell receptor-specific ligand and a DNA-binding agent. Several ligands have been used for receptor-mediated gene transfer. The most extensively characterized ligands are asialoorosomucoid (ASOR) (Wu and Wu, 1987) and transferrin (Wagner et al., 1990). Recently, a synthetic neoglycoprotein, which recognizes the same receptor as ASOR, has been used as a gene delivery vehicle (Ferkol et al., 1993; Perales et al., 1994) and epidermal growth factor (EGF) has also been used to deliver genes to squamous carcinoma cells (Myers, EPO 0273085).

In other embodiments, the delivery vehicle may comprise a ligand and a liposome. For example, Nicolau et al. (1987) employed lactosyl-ceramide, a galactose-terminal asialoganglioside, incorporated into liposomes and observed an increase in the uptake of the insulin gene by hepatocytes. Thus, it is feasible that a nucleic acid encoding a particular gene also may be specifically delivered into a cell type by any number of receptor-ligand systems with or without liposomes. For example, epidermal growth factor (EGF) may be used as the ligand for receptor-mediated delivery of a nucleic acid into cells that exhibit upregulation of EGF receptor. Mannose can be used to target the mannose receptor on liver cells. Also, antibodies to CD5 (CLL), CD22 (lymphoma), CD25 (T-cell leukemia) and MAA (melanoma) can similarly be used as targeting moieties.

In a particular example, the oligonucleotide may be administered in combination with a cationic lipid. Examples of cationic lipids include, but are not limited to, lipofectin, DOTMA, DOPE, and DOTAP. The publication of WO/0071096, which is specifically incorporated by reference, describes different formulations, such as a DOTAP:cholesterol or cholesterol derivative formulation that can effectively be used for gene therapy. Other disclosures also discuss different lipid or liposomal formulations including nanoparticles and methods of administration; these include, but are not limited to, U.S. Patent Publication 20030203865, 20020150626, 20030032615, and 20040048787, which are specifically incorporated by reference to the extent they disclose formulations and other related aspects of administration and delivery of nucleic acids. Methods used for forming particles are also disclosed in U.S. Pat. Nos. 5,844,107, 5,877,302, 6,008,336, 6,077,835, 5,972,901, 6,200,801, and 5,972,900, which are incorporated by reference for those aspects.

In certain embodiments, gene transfer may more easily be performed under ex vivo conditions. Ex vivo gene therapy refers to the isolation of cells from an animal, the delivery of a nucleic acid into the cells in vitro, and then the return of the modified cells back into an animal. This may involve the surgical removal of tissue/organs from an animal or the primary culture of cells and tissues.

The present invention also encompasses pharmaceutical compositions comprising PANDA or one or more PANDA inhibitors and a pharmaceutically acceptable carrier. Where clinical applications are contemplated, pharmaceutical compositions will be prepared in a form appropriate for the intended application. Generally, this will entail preparing compositions that are essentially free of pyrogens, as well as other impurities that could be harmful to humans or animals.

Colloidal dispersion systems, such as macromolecule complexes, nanocapsules, microspheres, beads, and lipid-based systems including oil-in-water emulsions, micelles, mixed micelles, and liposomes, may be used as delivery vehicles for PANDA or PANDA inhibitors described herein. Commercially available fat emulsions that are suitable for delivering the nucleic acids of the invention to tissues, such as cardiac muscle tissue and smooth muscle tissue, include Intralipid, Liposyn, Liposyn II, Liposyn III, Nutrilipid, and other similar lipid emulsions. A preferred colloidal system for use as a delivery vehicle in vivo is a liposome (i.e., an artificial membrane vesicle). The preparation and use of such systems is well known in the art. Exemplary formulations are also disclosed in U.S. Pat. No. 5,981,505; U.S. Pat. No. 6,217,900; U.S. Pat. No. 6,383,512; U.S. Pat. No. 5,783,565; U.S. Pat. No. 7,202,227; U.S. Pat. No. 6,379,965; U.S. Pat. No. 6,127,170; U.S. Pat. No. 5,837,533; U.S. Pat. No. 6,747,014; and WO 03/093449, which are herein incorporated by reference in their entireties.

One will generally desire to employ appropriate salts and buffers to render delivery vehicles stable and allow for uptake by target cells. Buffers also will be employed when recombinant cells are introduced into a patient. Aqueous compositions of the present invention comprise an effective amount of the delivery vehicle, dissolved or dispersed in a pharmaceutically acceptable carrier or aqueous medium. The phrases “pharmaceutically acceptable” or “pharmacologically acceptable” refer to molecular entities and compositions that do not produce adverse, allergic, or other untoward reactions when administered to an animal or a human. As used herein, “pharmaceutically acceptable carrier” includes solvents, buffers, solutions, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents and the like acceptable for use in formulating pharmaceuticals, such as pharmaceuticals suitable for administration to humans. The use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active ingredients of the present invention, its use in therapeutic compositions is contemplated. Supplementary active ingredients also can be incorporated into the compositions, provided they do not inactivate the nucleic acids of the compositions.

The pharmaceutical forms suitable for injectable use or catheter delivery include, for example, sterile aqueous solutions or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. Generally, these preparations are sterile and fluid to the extent that easy injectability exists. Preparations should be stable under the conditions of manufacture and storage and should be preserved against the contaminating action of microorganisms, such as bacteria and fungi. Appropriate solvents or dispersion media may contain, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), suitable mixtures thereof, and vegetable oils. The proper fluidity can be maintained, for example, by the use of a coating, such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. The prevention of the action of microorganisms can be brought about by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars or sodium chloride. Prolonged absorption of the injectable compositions can be brought about by the use in the compositions of agents delaying absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions may be prepared by incorporating the active compounds in an appropriate amount into a solvent along with any other ingredients (for example as enumerated above) as desired, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the various sterilized active ingredients into a sterile vehicle which contains the basic dispersion medium and the desired other ingredients, e.g., as enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation include vacuum-drying and freeze-drying techniques which yield a powder of the active ingredient(s) plus any additional desired ingredient from a previously sterile-filtered solution thereof.

The compositions of the present invention generally may be formulated in a neutral or salt form. Pharmaceutically-acceptable salts include, for example, acid addition salts (formed with the free amino groups of the protein) derived from inorganic acids (e.g., hydrochloric or phosphoric acids), or from organic acids (e.g., acetic, oxalic, tartaric, mandelic, and the like). Salts formed with the free carboxyl groups of the protein can also be derived from inorganic bases (e.g., sodium, potassium, ammonium, calcium, or ferric hydroxides) or from organic bases (e.g., isopropylamine, trimethylamine, histidine, procaine and the like).

Upon formulation, solutions are preferably administered in a manner compatible with the dosage formulation and in such amount as is therapeutically effective. The formulations may easily be administered in a variety of dosage forms such as injectable solutions, drug release capsules and the like. For parenteral administration in an aqueous solution, for example, the solution generally is suitably buffered and the liquid diluent first rendered isotonic, for example, with sufficient saline or glucose. Such aqueous solutions may be used, for example, for intravenous, intramuscular, subcutaneous and intraperitoneal administration. Preferably, sterile aqueous media are employed as is known to those of skill in the art, particularly in light of the present disclosure. By way of illustration, a single dose may be dissolved in 1 ml of isotonic NaCl solution and either added to 1000 ml of hypodermoclysis fluid or injected at the proposed site of infusion (see, for example, “Remington's Pharmaceutical Sciences” 15th Edition, pages 1035-1038 and 1570-1580). Some variation in dosage will necessarily occur depending on the condition of the subject being treated. The person responsible for administration will, in any event, determine the appropriate dose for the individual subject. Moreover, for human administration, preparations should meet sterility, pyrogenicity, and general safety and purity standards as required by FDA Office of Biologics standards.

D. Administration

At least one therapeutically effective dose of a PANDA inhibitor and at least one chemotherapeutic agent will be administered. The PANDA inhibitor may be an antisense oligonucleotide or an inhibitory RNA molecule such as a miRNA, siRNA, piRNA, or snRNA, as described herein. Chemotherapeutic agents that can be used include, but are not limited to, abitrexate, adriamycin, adrucil, amsacrine, asparaginase, anthracyclines, azacitidine, azathioprine, bicnu, blenoxane, busulfan, bleomycin, camptosar, camptothecins, carboplatin, carmustine, cerubidine, chlorambucil, cisplatin, cladribine, cosmegen, cytarabine, cytosar, cyclophosphamide, cytoxan, dactinomycin, docetaxel, doxorubicin, daunorubicin, ellence, elspar, epirubicin, etoposide, fludarabine, fluorouracil, fludara, gemcitabine, gemzar, hycamtin, hydroxyurea, hydrea, idamycin, idarubicin, ifosfamide, ifex, irinotecan, lanvis, leukeran, leustatin, matulane, mechlorethamine, mercaptopurine, methotrexate, mitomycin, mitoxantrone, mithramycin, mutamycin, myleran, mylosar, navelbine, nipent, novantrone, oncovin, oxaliplatin, paclitaxel, paraplatin, pentostatin, platinol, plicamycin, procarbazine, purinethol, raltitrexed, taxotere, taxol, teniposide, thioguanine, tomudex, topotecan, valrubicin, velban, vepesid, vinblastine, vindesine, vincristine, vinorelbine, VP-16, and vumon.

By “therapeutically effective dose or amount” of each of these agents is intended an amount that when administered in combination brings about a positive therapeutic response with respect to treatment of an individual for cancer. Of particular interest is an amount of these agents that provides an anti-tumor effect, as defined herein. By “positive therapeutic response” is intended that the individual undergoing the combination treatment according to the invention exhibits an improvement in one or more symptoms of the cancer for which the individual is undergoing therapy.

Thus, for example, a “positive therapeutic response” would be an improvement in the disease in association with the combination therapy, and/or an improvement in one or more symptoms of the disease in association with the combination therapy. Therefore, for example, a positive therapeutic response would refer to one or more of the following improvements in the disease: (1) reduction in tumor size; (2) reduction in the number of cancer cells; (3) inhibition (i.e., slowing to some extent, preferably halting) of tumor growth; (4) inhibition (i.e., slowing to some extent, preferably halting) of cancer cell infiltration into peripheral organs; (5) inhibition (i.e., slowing to some extent, preferably halting) of tumor metastasis; and (6) some extent of relief from one or more symptoms associated with the cancer. Such therapeutic responses may be further characterized as to degree of improvement. Thus, for example, an improvement may be characterized as a complete response. By “complete response” is intended documentation of the disappearance of all symptoms and signs of all measurable or evaluable disease confirmed by physical examination, laboratory, nuclear and radiographic studies (i.e., CT (computer tomography) and/or MRI (magnetic resonance imaging)), and other non-invasive procedures repeated for all initial abnormalities or sites positive at the time of entry into the study. Alternatively, an improvement in the disease may be categorized as being a partial response. By “partial response” is intended a reduction of greater than 50% in the sum of the products of the perpendicular diameters of all measurable lesions when compared with pretreatment measurements.
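
The partial-response criterion above is an arithmetic rule, and a minimal sketch may make it concrete. The function names, the representation of each lesion as a pair of perpendicular diameters, and the example measurements are illustrative assumptions and are not part of the disclosure.

```python
def sum_of_products(lesions):
    """Sum, over all lesions, of the product of the two perpendicular diameters."""
    return sum(d1 * d2 for d1, d2 in lesions)


def is_partial_response(baseline, follow_up, threshold=0.50):
    """Return True when the sum of products fell by MORE than `threshold`.

    `baseline` and `follow_up` are lists of (diameter_1, diameter_2) pairs
    in the same units (hypothetical data layout).
    """
    before = sum_of_products(baseline)
    after = sum_of_products(follow_up)
    if before <= 0:
        raise ValueError("baseline must contain measurable lesions")
    reduction = (before - after) / before
    return reduction > threshold


# Hypothetical measurements in cm: 16 cm^2 at baseline, 5 cm^2 at follow-up.
baseline = [(4.0, 3.0), (2.0, 2.0)]
follow_up = [(2.0, 2.0), (1.0, 1.0)]
print(is_partial_response(baseline, follow_up))
```

Because the definition requires a reduction of greater than 50%, a reduction of exactly 50% is not counted as a partial response in this sketch.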

The actual dose to be administered will vary depending upon the age, weight, and general condition of the subject as well as the severity of the condition being treated, the judgment of the health care professional, and the particular agent being administered. Therapeutically effective amounts can be determined by those skilled in the art, and will be adjusted to the particular requirements of each particular case. Generally, a therapeutically effective amount will range from about 0.50 mg to 5 grams daily, more preferably from about 5 mg to 2 grams daily, even more preferably from about 7 mg to 1.5 grams daily. Preferably, such doses are in the range of 10-600 mg four times a day (QID), 200-500 mg QID, 25-600 mg three times a day (TID), 25-50 mg TID, 50-100 mg TID, 50-200 mg TID, 300-600 mg TID, 200-400 mg TID, 200-600 mg TID, 100 to 700 mg twice daily (BID), 100-600 mg BID, 200-500 mg BID, or 200-300 mg BID.

In certain embodiments, multiple therapeutically effective doses of each of at least one PANDA inhibitor and at least one chemotherapeutic agent will be administered according to a daily dosing regimen, or intermittently. For example, a therapeutically effective dose can be administered one day a week, two days a week, three days a week, four days a week, or five days a week, and so forth. By “intermittent” administration is intended that the therapeutically effective dose can be administered, for example, every other day, every two days, every three days, and so forth. For example, in some embodiments, at least one PANDA inhibitor and at least one chemotherapeutic agent will be administered twice-weekly or thrice-weekly for an extended period of time, such as for 1, 2, 3, 4, 5, 6, 7, 8 . . . 10 . . . 15 . . . 24 weeks, and so forth. By “twice-weekly” or “two times per week” is intended that two therapeutically effective doses of the agent in question are administered to the subject within a 7 day period, beginning on day 1 of the first week of administration, with a minimum of 72 hours between doses and a maximum of 96 hours between doses. By “thrice weekly” or “three times per week” is intended that three therapeutically effective doses are administered to the subject within a 7 day period, allowing for a minimum of 48 hours between doses and a maximum of 72 hours between doses. For purposes of the present invention, this type of dosing is referred to as “intermittent” therapy. In accordance with the methods of the present invention, a subject can receive intermittent therapy (i.e., twice-weekly or thrice-weekly administration of a therapeutically effective dose) for one or more weekly cycles until the desired therapeutic response is achieved. The agents can be administered by any acceptable route of administration as noted herein below.
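
The spacing rules for twice-weekly and thrice-weekly dosing can be expressed as a simple check on dose timestamps. The following is a sketch only: the `LIMITS` table, the function name, and the example dates are hypothetical, and the check covers a single weekly cycle.

```python
from datetime import datetime, timedelta

# (doses per 7-day period, minimum hours between doses, maximum hours between doses)
LIMITS = {
    "twice-weekly": (2, 72, 96),
    "thrice-weekly": (3, 48, 72),
}


def schedule_is_valid(dose_times, regimen):
    """Check one weekly cycle of dose times against the stated spacing limits."""
    n_doses, min_h, max_h = LIMITS[regimen]
    times = sorted(dose_times)
    if len(times) != n_doses:
        return False
    # All doses of the cycle must fall within a 7 day period.
    if times[-1] - times[0] > timedelta(days=7):
        return False
    gaps = [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]
    return all(min_h <= gap <= max_h for gap in gaps)


# Hypothetical cycle starting on day 1 at 09:00.
day1 = datetime(2024, 1, 1, 9, 0)
print(schedule_is_valid([day1, day1 + timedelta(hours=72)], "twice-weekly"))
```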

A PANDA inhibitor can be administered prior to, concurrent with, or subsequent to at least one chemotherapeutic agent. If provided at the same time as the chemotherapeutic agent, the PANDA inhibitor can be provided in the same or in a different composition. Thus, the agents can be presented to the individual by way of concurrent therapy. By “concurrent therapy” is intended administration to a human subject such that the therapeutic effect of the combination of the substances is caused in the subject undergoing therapy. For example, concurrent therapy may be achieved by administering at least one therapeutically effective dose of a pharmaceutical composition comprising a PANDA inhibitor and at least one therapeutically effective dose of a pharmaceutical composition comprising at least one chemotherapeutic agent according to a particular dosing regimen. Administration of the separate pharmaceutical compositions can be at the same time (i.e., simultaneously) or at different times (i.e., sequentially, in either order, on the same day, or on different days), so long as the therapeutic effect of the combination of these substances is caused in the subject undergoing therapy.

In certain embodiments, the PANDA inhibitor is administered for a brief period prior to administration of the chemotherapeutic agent and continued for a brief period after treatment with the chemotherapeutic agent is discontinued in order to ensure that the PANDA inhibitor levels are adequate in the subject during chemotherapy. For example, the PANDA inhibitor can be administered starting one week before administration of the first dose of the chemotherapeutic agent and continued for one week after administration of the last dose of the chemotherapeutic agent to the subject.

In other embodiments of the invention, the pharmaceutical composition comprising the agents, such as one or more PANDA inhibitors and/or chemotherapeutic agents, is a sustained-release formulation, or a formulation that is administered using a sustained-release device. Such devices are well known in the art, and include, for example, transdermal patches and miniature implantable pumps that can provide for drug delivery over time in a continuous, steady-state fashion at a variety of doses to achieve a sustained-release effect with a non-sustained-release pharmaceutical composition.

The pharmaceutical compositions comprising one or more PANDA inhibitors or chemotherapeutic agents may be administered using the same or different routes of administration in accordance with any medically acceptable method known in the art. Suitable routes of administration include parenteral administration, such as subcutaneous (SC), intraperitoneal (IP), intramuscular (IM), intravenous (IV), or infusion, oral and pulmonary, nasal, topical, transdermal, and suppositories. Where the composition is administered via pulmonary delivery, the therapeutically effective dose is adjusted such that the soluble level of the agent, such as the PANDA inhibitor in the bloodstream, is equivalent to that obtained with a therapeutically effective dose that is administered parenterally, for example SC, IP, IM, or IV. In some embodiments of the invention, the pharmaceutical composition comprising the PANDA inhibitor is administered by IM or SC injection, particularly by IM or SC injection locally to the region where the therapeutic agent or agents used in the cancer therapy protocol are administered.

Factors influencing the respective amount of the various compositions to be administered include, but are not limited to, the mode of administration, the frequency of administration (i.e., daily, or intermittent administration, such as twice- or thrice-weekly), the particular disease undergoing therapy, the severity of the disease, the history of the disease, whether the individual is undergoing concurrent therapy with another therapeutic agent, and the age, height, weight, health, and physical condition of the individual undergoing therapy. Generally, a higher dosage of this agent is preferred with increasing weight of the subject undergoing therapy.

Where a subject undergoing therapy in accordance with the previously mentioned dosing regimens exhibits a partial response, or a relapse following a prolonged period of remission, subsequent courses of concurrent therapy may be needed to achieve complete remission of the disease. Thus, subsequent to a period of time off from a first treatment period, a subject may receive one or more additional treatment periods comprising chemotherapy in combination with a PANDA inhibitor. Such a period of time off between treatment periods is referred to herein as a time period of discontinuance. It is recognized that the length of the time period of discontinuance is dependent upon the degree of tumor response (i.e., complete versus partial) achieved with any prior treatment periods of concurrent therapy with these therapeutic agents.

E. Kits

Any of the compositions described herein may be included in a kit. For example, PANDA, and/or at least one PANDA inhibitor, and/or at least one chemotherapeutic agent, or any combination thereof, may be included in a kit. The kit may also include one or more transfection reagents to facilitate delivery of oligonucleotides or polynucleotides to cells.

The components of the kit may be packaged either in aqueous media or in lyophilized form. The container means of the kits will generally include at least one vial, test tube, flask, bottle, syringe or other container means, into which a component may be placed, and preferably, suitably aliquoted. Where there is more than one component in the kit (labeling reagent and label may be packaged together), the kit also will generally contain a second, third or other additional container into which the additional components may be separately placed. However, various combinations of components may be comprised in a vial. The kits of the present invention also will typically include a means for containing the nucleic acids, and any other reagent containers in close confinement for commercial sale. Such containers may include injection or blow-molded plastic containers into which the desired vials are retained.

When the components of the kit are provided in one or more liquid solutions, the liquid solution is an aqueous solution, with a sterile aqueous solution being particularly preferred. However, the components of the kit may be provided as dried powder(s). When reagents and/or components are provided as a dry powder, the powder can be reconstituted by the addition of a suitable solvent. It is envisioned that the solvent may also be provided in another container means.

The container means will generally include at least one vial, test tube, flask, bottle, syringe and/or other container means, into which the nucleic acid formulations are placed, preferably, suitably allocated. The kits may also comprise a second container means for containing a sterile, pharmaceutically acceptable buffer and/or other diluent.

Such kits may also include components that preserve or maintain the PANDA inhibitors or lncRNAs or that protect against their degradation. Such components may be RNAse-free or protect against RNAses. Such kits generally will comprise, in suitable means, distinct containers for each individual reagent or solution.

A kit will also include instructions for employing the kit components as well as the use of any other reagent not included in the kit. Instructions may include variations that can be implemented. A kit may also include utensils or devices for administering a PANDA inhibitor by various administration routes, such as parenteral or catheter administration or coated stent.

III. EXPERIMENTAL

Below are examples of specific embodiments for carrying out the present invention. The examples are offered for illustrative purposes only, and are not intended to limit the scope of the present invention in any way.

Efforts have been made to ensure accuracy with respect to numbers used (e.g., amounts, temperatures, etc.), but some experimental error and deviation should, of course, be allowed for.

Example 1

Extensive and Coordinated Transcription of Noncoding RNAs within Cell-Cycle Promoters

Introduction

In this study, we create an ultrahigh-resolution tiling microarray to interrogate the transcriptional and chromatin landscape around the transcription start sites (TSSs) of 56 cell-cycle genes, including genes encoding all cyclins, cyclin-dependent kinases (CDKs) and cyclin-dependent kinase inhibitors (CDKIs). We analyze a diverse collection of cells and tissue samples that interrogate distinct perturbations in cell-growth control. Our results reveal a map of extensive and choreographed noncoding transcription and identify a specific set of lncRNAs that function in the DNA damage response.

Methods

Tiling Array Design and RNA Hybridization

A custom tiling array (Roche NimbleGen) was designed at 5 bp resolution across 25 kb of the 9p21 region (which encompasses CDKN2A, P14ARF and CDKN2B), as well as from 10 kb upstream to 2 kb downstream of each TSS from 53 other cell-cycle genes, including those encoding cyclins, CDKs and CDKIs (Table 1). In addition, the HOXA and HOXD loci were placed on the array as a control. Briefly, RNA was amplified (MessageAmp Kit, Ambion), reverse transcribed (RETROscript Kit, Ambion), labeled and hybridized according to the standard NimbleGen protocol.
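
The promoter windows described above (10 kb upstream to 2 kb downstream of each TSS) can be derived from a TSS coordinate and strand. The sketch below is illustrative only: the function names and the example TSS are hypothetical, and treating “5 bp resolution” as a 5 bp spacing between probe start positions is an assumption rather than a statement of the actual array design.

```python
def tss_window(tss, strand="+", upstream=10_000, downstream=2_000):
    """Return (start, end) genomic coordinates of the tiled region around a TSS.

    "Upstream" is taken relative to the direction of transcription, so the
    window is mirrored for genes on the minus strand.
    """
    if strand == "+":
        return tss - upstream, tss + downstream
    if strand == "-":
        return tss - downstream, tss + upstream
    raise ValueError("strand must be '+' or '-'")


def probe_starts(start, end, step=5):
    """Probe start positions spaced `step` bp apart across [start, end]."""
    return list(range(start, end + 1, step))


# Hypothetical TSS at position 1,000,000 on the plus strand.
start, end = tss_window(1_000_000, "+")
print(start, end, len(probe_starts(start, end)))
```

A 12 kb window of this kind is consistent with the roughly 12 kb spans listed for the individual genes in Table 1.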

TABLE 1 Tiling Array Design

Feature Name   Chromosome   Coordinates (Human March 2006 NCBI Build 36.1 hg18)
9p21 locus     9            21900000-22150000
CCNA1          13           35894632-35906659
CCNA2          4            122962330-122974342
CCNB1          5            68488668-68500750
CCNB2          15           57174611-57186627
CCNB3          X            50034275-50046275
CCNC           6            100121225-100133411
CCND1          11           69155053-69167126
CCND2          12           4243198-4255223
CCND3          6            42015122-42027530
CCNE1          19           34984740-34997400
CCNE2          8            95974605-95986660
CCNF           16           2409440-2421471
CCNG1          5            162787154-162799204
CCNG2          4            78287401-78299550
CCNH           5            86742441-86754592
CCNI           4            78214148-78226149
CCNJ           10           97783140-97795329
CCNJL          5            159697177-159709177
CCNK           14           99007491-99019512
CCNL1          3            158358577-158371176
CCNL2          1            1322552-1334571
CCNO           5            54563265-54575265
CCNT1          12           47395048-47407048
CCNT2          2            135382862-135394875
CCNY           10           35565959-35577959
CCNYL1         2            208274509-208286509
CCNYL2         10           42268168-42280168
CCNYL3         16           34105360-34117360
CDK2           12           54636825-54648899
CDK3           17           71499013-71511013
CDK4           12           56430344-56442431
CDK5           7            150383893-150395929
CDK5R1         17           27828217-27840681
CDK5R2         2            219522620-219534641
CDK6           7            92299148-92311148
CDK8           13           25716755-25728778
CDK9           9            129578151-129590188
CDK10          16           88270578-88282613
CDKL1          14           49930367-49942367
CDKL2          4            76772269-76784595
CDKL3          5            133728113-133740664
CDKL4          2            39308177-39320177
CDKL5          X            18343677-18355677
CDKN1A         6            36744464-36756493
CDKN1B         12           12751575-12763663
CDKN1C         11           2861551-2873577
CDKN2C         1            51196148-51210203
CDKN2D         19           10538631-10550655
CDKN3          14           53923462-53935475
CNNM1          10           101070022-101082022
CNNM2          10           104658064-104670853
CNNM3          2            96835714-96848658
CNNM4          2            96780365-96792606

Peak Calling

Robust multichip average normalized single channel data from each array were subjected to peak calling using the NimbleScan program (Roche NimbleGen) with a window size of 50. Peaks with a peak score greater than ten were considered significant transcriptional units. Peak calls from all 55 array samples were clustered using Galaxy (Carninci et al. (2005) Science 309:1559-1563, Taylor et al. (2007) Curr. Protoc. Bioinformatics Chapter 10, Unit 10.5; herein incorporated by reference), and only transcripts present in a minimum of 10% of the samples were considered for further analysis. Transcripts were annotated as follows: ‘genomic location (upstream of TSS of cell-cycle protein-coding gene, upst; exon of cell-cycle protein-coding gene, exon; intron of cell-cycle protein-coding gene, int; downstream of cell-cycle protein coding gene, dst)’; ‘gene symbol of nearest mRNA’; ‘distance from TSS’.
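
The clustering and recurrence filter can be sketched in a few lines. This is not the Galaxy workflow itself but a simplified stand-in under stated assumptions: peaks are (start, end) intervals on a single chromosome with an exclusive end, clusters are built by single linkage against the running cluster span, and the 50-base minimum overlap is taken from the Results description. All function and sample names are hypothetical.

```python
def cluster_peaks(peaks_by_sample, min_overlap=50):
    """Group peaks from many samples into clusters of overlapping intervals.

    `peaks_by_sample` maps a sample name to a list of (start, end) peaks.
    A peak joins the current cluster when it overlaps the cluster's running
    span by at least `min_overlap` bases.
    """
    flat = sorted(
        (start, end, sample)
        for sample, peaks in peaks_by_sample.items()
        for start, end in peaks
    )
    clusters = []
    for start, end, sample in flat:
        if clusters:
            last = clusters[-1]
            # Peaks are sorted by start, so the overlap begins at `start`.
            overlap = min(end, last["end"]) - start
            if overlap >= min_overlap:
                last["end"] = max(last["end"], end)
                last["samples"].add(sample)
                continue
        clusters.append({"start": start, "end": end, "samples": {sample}})
    return clusters


def recurrent_clusters(clusters, n_samples, min_fraction=0.10):
    """Keep clusters observed in at least `min_fraction` of all samples."""
    return [c for c in clusters if len(c["samples"]) / n_samples >= min_fraction]


# Hypothetical peak calls; 20 samples were profiled in total.
peaks = {"s1": [(100, 300)], "s2": [(200, 400)], "s3": [(1000, 1100)]}
kept = recurrent_clusters(cluster_peaks(peaks), n_samples=20)
print([(c["start"], c["end"], sorted(c["samples"])) for c in kept])
```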

Measuring Protein Coding Potential

To assess the coding potential of the new transcribed regions, we evaluated the evolutionary signatures in their alignments with orthologous regions in 20 other sequenced placental mammalian genomes using the codon substitution frequencies (CSF) method (Lin et al. Nature Precedings published online, doi:10.1038/npre.2010.4784.1 (18 Aug. 2010); Lin et al. (2007) Genome Res. 17:1823-1836; Lin et al. (2008) PLOS Comput. Biol. 4, e1000067, herein incorporated by reference in their entireties), which has also been applied to assess new transcribed regions in mouse. CSF produces a score for any region in the genome considering all codon substitutions observed within its alignment, based on the relative frequency of similar substitutions in known coding and noncoding regions. Briefly, CSF performs a statistical comparison between two empirical codon models (Kosiol et al. (2007) Mol. Biol. Evol. 24, 1464-1479), one estimated from alignments of known coding regions and the other based on noncoding regions, and reports a likelihood ratio that quantifies whether the protein-coding model is a better explanation while controlling for the overall level of sequence conservation (Lin et al. Nature Precedings published online, doi:10.1038/npre.2010.4784.1).

Module Map Analysis

We generated a module map of the ncRNAs versus the protein-coding genes by computing the Pearson correlations for all pairwise combinations based on expression across 17 different samples. This map was clustered and visualized using the program Genomica (see URLs). For each ncRNA, we then defined gene sets of the protein-coding genes that had a Pearson correlation with that ncRNA that was greater than 0.5 or less than −0.5. To determine functional associations, we then generated a module map of these ncRNA gene sets with Gene Ontology Biological Processes gene sets (FIG. 3C) and with curated gene sets of metabolic and signaling pathways and biological and clinical states from the Molecular Signatures Database (MSigDB c2 collection) (FIG. 12) (Subramanian et al. (2005) Proc. Natl. Acad. Sci. USA 102:15545-15550). The P value of enrichment was determined by the hypergeometric distribution, and a false discovery rate (FDR) calculation was used to account for multiple hypothesis testing (P<0.05, FDR<0.05).
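
The two computations in this analysis, correlation-based gene sets and hypergeometric enrichment, can be illustrated with a short standard-library sketch. It is not the Genomica implementation. The function names and toy profiles are hypothetical, and splitting the gene sets at r > 0.5 and r < −0.5 is an assumed reading of the correlation cutoff. The FDR step is omitted because the correction method is not specified.

```python
from math import comb, sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlated_gene_sets(ncrna_profile, gene_profiles, cutoff=0.5):
    """Return (positively, negatively) correlated gene sets for one ncRNA."""
    positive, negative = set(), set()
    for gene, profile in gene_profiles.items():
        r = pearson(ncrna_profile, profile)
        if r > cutoff:
            positive.add(gene)
        elif r < -cutoff:
            negative.add(gene)
    return positive, negative


def hypergeom_pvalue(k, K, n, N):
    """P(X >= k) when drawing n genes from N, of which K are annotated."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


# Toy profiles across four samples (hypothetical values).
ncrna = [1.0, 2.0, 3.0, 4.0]
genes = {"A": [2.0, 4.0, 6.0, 8.0], "B": [8.0, 6.0, 4.0, 2.0], "C": [1.0, -1.0, -1.0, 1.0]}
print(correlated_gene_sets(ncrna, genes))
print(hypergeom_pvalue(k=2, K=4, n=3, N=10))
```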

Tissue Samples and Cells

Informed consent was obtained for tissue donation, and we obtained approval from institutional review boards of Stanford University, Johns Hopkins University and Netherlands Cancer Institute. Human primary breast tumors from The Netherlands Cancer Institute (van de Vijver et al. (2002) N. Engl. J. Med. 347:1999-2009) and normal breast tissues and metastatic breast tumors from the Johns Hopkins University Rapid Autopsy Program are as described (Gupta et al. (2010) Nature 464:1071-1076). Human fetal pancreata were obtained from the Birth Defects Research Laboratory, University of Washington. Staged fetal pancreata were processed within 24 hours of receipt, minced, washed and processed for RNA isolation using standard methods. Human fetal lung fibroblasts FL3 (Coriell AG04393) or foreskin fibroblasts (ATCC CRL2091) were cultured in 10% FBS (Hyclone) and 1% penicillin-streptomycin (Gibco) at 37° C. in 5% CO₂.

PANDA Cloning and Sequence Analysis

3′ and 5′ RACE was performed using the FirstChoice RLM-RACE Kit (Ambion). RNA was extracted from 200 ng/ml doxorubicin (Sigma)-treated human fetal lung fibroblasts, polyA was selected using the Poly(A)Purist MAG kit (Ambion) and RLM-RACE was performed according to the standard manufacturer's protocol.

RT-PCR

Total RNA was extracted from cells using the TRIzol reagent (Invitrogen) and the RNeasy Mini Kit (Qiagen), and genomic DNA was eliminated using TURBO DNA-free (Ambion). RT-PCR on 50-250 ng of total RNA was performed using the One-Step RT-PCR Master Mix (Applied Biosystems) with TaqMan Gene Expression Assays and normalized to GAPDH. Strand-specific RT-PCR for PANDA was performed using the One-Step RT-PCR Master Mix SYBR Green (Stratagene).

TaqMan® custom ncRNA Assays

A panel of TaqMan custom ncRNA assays was developed targeting 60 of the 219 new transcribed regions using the ‘single-exon’ design mode. The transcript specificity and genome specificity of all TaqMan assays were verified using a position-specific alignment matrix to predict potential cross reactivity between designed assays and genome-wide nontarget transcripts or genomic sequences. For gene expression profiling of these ncRNAs across different conditions, complementary DNAs (cDNA) were generated from 50 ng of total RNA using the High Capacity cDNA Reverse Transcription Kit (Life Technologies). The resulting cDNA was subjected to a 14-cycle PCR amplification followed by real-time PCR reaction using the manufacturer's TaqMan PreAmp Master Mix Kit Protocol (Life Technologies). Two replicates were run for each gene for each sample in a 384-well format plate on the 7900HT Fast Real-Time PCR System (Life Technologies). PPIA was used as an endogenous control for normalization across different samples.

RNA Blot

We obtained 5 μg of polyA RNA using an RNeasy Kit (QIAGEN) and PolyA Purist Mag (Ambion). RNA blots were performed using a NorthernMax Kit (Ambion) following the standard manufacturer's protocol. Probes were generated with full length PANDA using the Prime-It RmT Random Primer Labeling Kit (Agilent).

Antibodies

The following antibodies were used for chromatin immunoprecipitation assays: anti-H3K4me3 (Abcam ab8580), anti-H3K36me3 (Abcam ab9050) and anti-p53 (Abcam ab28). Protein blots were performed using anti-PARP (Cell Signal 9542), anti-β-tubulin (Abcam ab6046), anti-LSD1 (ab17721), anti-EZH2 (Cell Signal AC22), anti-p21 (Santa Cruz Biotech) and anti-NF-YA (Santa Cruz Biotech H-209).

RNA Interference

Human fetal lung fibroblasts were transfected with 50 nM of ON-TARGETPlus siRNAs (Dharmacon) targeting PANDA (Table 2). Validated siRNAs for mRNAs were obtained from Ambion (Table 2).

TUNEL

TUNEL assays were performed using the in situ Cell Death Detection Kit, TMR Red (Roche). Human fetal lung fibroblasts were cultured on chamber slides (Lab-Tek), treated with 200 ng/ml doxorubicin (Sigma) for 24 hours, fixed with methanol at −20° C. for 10 minutes and incubated with the TUNEL labeling mixture for 1 hour at 37° C. Slides were then washed with PBS and mounted in Prolong Gold antifade reagent with DAPI (Invitrogen) and imaged at 20× magnification.

RNA Immunoprecipitation

Ten million cells were treated with 200 ng/ml doxorubicin for 16 hours, trypsinized and crosslinked with 1% formaldehyde for 10 minutes, followed by the addition of 0.125 M glycine for 5 minutes. After two PBS washes, cells were lysed with 2× volume of Buffer A (10 mM HEPES pH 7.5, 1.5 mM MgCl₂, 10 mM KCl, 0.5 mM DTT, 1 mM PMSF) for 15 minutes on ice at 150 r.p.m. NP-40 was added to a final concentration of 0.25% for 5 minutes on ice. Lysates were centrifuged for 3 minutes at 2,000 r.p.m., and the supernatant (cytosol) was collected. Next, a volume of Buffer C (20 mM HEPES pH 7.5, 10% glycerol, 0.42 M KCl, 4 mM MgCl₂, 0.5 mM DTT, and 1 mM PMSF) equal to that used of Buffer A was added to the pellet for 20 minutes with frequent vortexing. Nuclear lysates were dounced for 5 seconds using a motorized pestle and sonicated for 7 minutes using a Diagenode Sonicator (30 seconds on, 30 seconds off, power setting H). Nuclear and cytoplasmic lysates were combined and centrifuged for 15 minutes at 13,000 r.p.m. Supernatants were transferred into micro spin columns (Pierce 89879), and 2 μg of antibody was added and incubated overnight. We washed 10 μl of Protein A/G UltraLink Resin (Pierce 53132) three times with RIP wash buffer (50 mM Tris-HCl pH 7.9, 10% glycerol, 100 mM KCl, 5 mM MgCl₂, 10 mM β-mercaptoethanol and 0.1% NP-40) and added it to the immunoprecipitation reaction for 1 hour at 4° C. Samples were washed four times with RIP wash buffer and two times with 1 M RIPA (50 mM Tris pH 7.4, 1 M NaCl, 1 mM EDTA, 0.1% SDS, 1% NP-40, 0.5% sodium deoxycholate, 0.5 mM DTT and 1 mM PMSF). Beads were resuspended in 200 μl of 150 mM RIPA (50 mM Tris pH 7.4, 150 mM NaCl, 1 mM EDTA, 0.1% SDS, 1% NP-40, 0.5% sodium deoxycholate, 0.5 mM DTT and 1 mM PMSF) plus 5 μl Proteinase K (Ambion) and incubated for 1 hour at 45° C. We added 1 ml of TRIzol to the sample, and RNA was extracted using the RNeasy Mini Kit (QIAGEN) with the on-column DNase digest (QIAGEN).

RNAse Mediated RNA Chromatography

RNAse mediated RNA chromatography (Michlewski et al. (2010) RNA 16:1673-1678, herein incorporated by reference in its entirety) was performed as previously described with the following modifications: 6 pmols of RNA (PANDA or a 1.2-kb fragment of LacZ) were used per reaction. RNA was folded (90° C. for 2 minutes, ice for 2 minutes), supplied with RNA structure buffer (Ambion) and shifted to room temperature (22-25° C.) for 20 minutes before conjugation to beads. RNAse digestion was performed with 5 μl of RNase A/T1 cocktail (Ambion) and 2 μl of RNase V1 (Ambion). Cellular lysates were prepared as follows: 10 million doxorubicin treated cells (16 hours) were incubated in 200 μl PBS, 600 μl H₂O and 200 μl nuclear lysis buffer (1.28 M sucrose; 40 mM Tris-HCl pH 7.5; 20 mM MgCl₂; 4% Triton X-100) on ice for 20 minutes. Nuclei were pelleted by centrifugation at 2,500 g for 15 minutes. The nuclear pellet was resuspended in 1 ml RIP buffer (150 mM KCl, 25 mM Tris pH 7.4, 0.5 mM DTT, 0.5% NP40, 1 mM PMSF and protease inhibitor (Roche Complete Protease Inhibitor Cocktail Tablets)). Resuspended nuclei were sheared using a motorized douncer for 5 seconds. Nuclear membrane and debris were pelleted by centrifugation at 18,000 g for 10 minutes.

Chromatin Immunoprecipitation (ChIP)

ChIP was performed as previously described (Rinn et al. (2006) PLoS Genet. 2, e119). qPCR primers for FAS and CCNB1 and FAS-control NF-YA binding sites were obtained from Morachis et al. (Genes Dev. (2010) 24, 135-147). Primers for PUMA and BAX were designed to surround the NF-YA consensus motif CCAAT (Table 2).

TABLE 2 Primers and Oligos

RACE primers for PANDA
Fwd            5′-CAGAACTTGGCATGATGGAG-3′ (SEQ ID NO: 4)
Rev            5′-TGATATGAAACTCGGTTTACTACTAGC-3′ (SEQ ID NO: 5)
Fwd2           5′-TGCACACATTTAACCCGAAG-3′ (SEQ ID NO: 6)
Rev2           5′-CCCCAAAGCTACATCTATGACA-3′ (SEQ ID NO: 7)
Rev3           5′-CGTCTCCATCATGCCAAGTT-3′ (SEQ ID NO: 8)
Rev4           5′-CATAGAGCTTCACCGACATAGC-3′ (SEQ ID NO: 9)

RT-PCR primers for PANDA
Fwd            5′-TGCACACATTTAACCCGAAG-3′ (SEQ ID NO: 10)
Rev            5′-CCCCAAAGCTACATCTATGACA-3′ (SEQ ID NO: 11)

siRNAs for PANDA
siRNA pool A   5′-AAUGUGUGCACGUAACAGAUU-3′ (SEQ ID NO: 12)
               5′-GAGAUUUGCAGCAGACACAUU-3′ (SEQ ID NO: 13)
siRNA pool B   5′-GGGCAUGUUUUCACAGAGGUU-3′ (SEQ ID NO: 14)
               5′-GAGAUUUGCAGCAGACACAUU-3′ (SEQ ID NO: 13)
siRNA pool C   5′-AAUGUGUGCACGUAACAGAUU-3′ (SEQ ID NO: 12)
               5′-GGGCAUGUUUUCACAGAGGUU-3′ (SEQ ID NO: 14)
siCTRL         Dharmacon D-001810-10

siRNAs for mRNAs
siNFYA pool    si9530 Ambion
               si9529 Ambion
               si9528 Ambion
siTP53         S5606 Ambion
siCDKN1A       S417 Ambion

ChIP primers
PUMA fwd       5′-CGT GGA TTC CTG TCT CCT CT-3′ (SEQ ID NO: 15)
PUMA rev       5′-GTC ACT CTG GTG AGG CGA TT-3′ (SEQ ID NO: 16)
NOXA fwd       5′-TTT CCC TTC CCT GTT ACT GC-3′ (SEQ ID NO: 17)
NOXA rev       5′-CTT GGG TAA ACA AGC CCA GA-3′ (SEQ ID NO: 18)

TaqMan assays
PANDA          custom TaqMan
TP53           Hs99999147_m1
LAP3           Rh02870758_m1
APAF1          Hs00559441_m1
LRDD           Hs00388035_m1
FAS            Hs00163653_m1
BIK            Hs00154189_m1
CDKN1A         Hs01121168_m1
GAPDH          Hs99999905_m1

Results

Extensive Noncoding Transcription Near Cell-Cycle Genes

To systematically discover functional ncRNAs in the regulatory region of human cell-cycle genes, we created a tiling array that interrogates at 5-nucleotide resolution across 25 kb of the 9p21 locus (which encompasses CDKN2A (p16), p14ARF and CDKN2B (p15)), as well as from 10 kb upstream to 2 kb downstream of each TSS from 53 cell-cycle genes to include those that encode all known cyclins, CDKs and CDKIs (FIG. 1A and Table 1). These genes are also critical for fundamental biological processes such as senescence, self-renewal, DNA damage response and tumor formation (Sherr et al. (1999) Genes Dev. 13:1501-1512; Hall et al. (1996) Adv. Cancer Res. 68:67-108; Johnson et al. (1999) Annu. Rev. Pharmacol. Toxicol. 39, 295-312). Thus, we hybridized 54 pairs of polyadenylated RNAs from various human cells that were altered or perturbed through cell-cycle synchronization, DNA damage, differentiation stimuli, oncogenic stimuli or carcinogenesis (Table 3).

A peak calling algorithm searched for statistically significant signals above background and detected contiguous regions (peaks) of at least 50 bp. We then compiled statistically significant transcripts from all 108 channels of the 54 arrays, clustered all transcripts that overlapped by a minimum of 50 bases and identified clusters that were present in at least 10% of the samples. Averaging the signal intensity across all probes in a peak produced a quantitative estimate of transcript abundance. Despite possible 3′ bias caused by polyadenylated RNA selection, our procedure detected exon 1 transcription from the majority of cell-cycle coding genes (41 of the 56), showing that this custom tiling array can detect previously reported transcribed regions. In each individual sample, we detected an average of 73 of the 216 transcribed regions (with a range of 14-189 transcribed regions) that did not overlap with known exons of the 56 cell-cycle genes (FIG. 9; an example of the CCNE1 locus in human fetal lung fibroblasts is shown in FIG. 1B). Across all 108 samples, we identified a total of 216 discrete transcribed regions (Table 4). The average transcript length was 234 nucleotides (with a range of 50-1,494 nucleotides). One hundred seventy-one of the 216 (79%) previously unidentified transcribed regions were located 5′ of the TSS of the cell-cycle genes (‘upstream’), 40 of the 216 (19%) were located within introns (‘intronic’), and 5 of the 216 (2%) were located downstream of the 3′ end of CDKN2A.

Genes actively transcribed by RNA polymerase II are marked by trimethylation of histone H3 on lysine 4 (H3K4me3) and on lysine 36 (H3K36me3), which reflect gene starts and bodies, respectively (Rando et al. (2009) Annu. Rev. Biochem. 78:245-271). These chromatin marks can be used to identify non-coding transcription (Guttman et al. (2009) Nature 458, 223-227, herein incorporated by reference). In a subset of our samples, we determined whether the 216 transcribed regions were similarly marked for active transcription by performing chromatin immunoprecipitation followed by hybridization to our custom tiling array (ChIP-chip). This analysis confirmed that the chromatin state at a majority of the newly defined transcripts was enriched in both H3K4me3 and H3K36me3 (FIGS. 1B and 1C). Using EpiGRAPH analysis to query our transcripts against approximately 900 published genomic attributes (Bock et al. (2009) Genome Biol. 10:R14), the 216 putative transcribed regions were enriched for H3K4me3 (P<10⁻⁹) and RNA polymerase II binding (P<10⁻⁷), providing further evidence that these genomic regions are actively transcribed.

To determine whether the 216 transcripts may encode previously unknown protein-coding exons or noncoding RNAs, we used a codon substitution frequency (CSF) analysis to assess characteristic evolutionary signatures of protein-coding sequences across 21 sequenced mammalian genomes (Lin et al. Nature Precedings published online, doi:10.1038/npre.2010.4784.1 (18 Aug. 2010), herein incorporated by reference in its entirety). As expected, the transcribed regions that coincided with annotated exons had high CSF scores. However, over 86% of the new transcribed regions had CSF scores well below the threshold of known protein-coding genes and resembled known ncRNAs (FIG. 1D and Table 5), suggesting that most of the new regions do not have protein-coding potential. BLAST analysis confirmed that the majority of the transcripts are not known protein-coding genes (Table 5). Furthermore, none of the transcripts intersect known pre-miRNAs, C/D box small nucleolar RNAs, H/ACA box small nucleolar RNAs or small Cajal-body specific RNAs as annotated in the UCSC genome browser. We hereafter refer to these transcribed regions as long noncoding RNAs (lncRNAs). We aligned the RNA hybridization signals at all 56 protein-coding loci of all 108 samples relative to their TSS (FIG. 1E). As expected, we found a peak immediately downstream of the TSS corresponding to exon 1 of the protein-coding gene. In addition, we found enrichment of non-coding transcription in the region 4-8 kb upstream of the TSS. Thus, unlike the previously described PASRs, tiny RNAs and TSSaRNAs, which are primarily located within 100 bp of the TSS, the majority of these ncRNAs are longer and are not clustered immediately around the TSS.

Expression Patterns of ncRNAs Suggest Specific Biological Functions

Next, we examined the biological conditions that regulate expression of these ncRNAs in order to infer possible biological functions. We assembled a matrix of the expression changes of the 216 new transcribed regions across all 54 perturbations and hierarchically clustered the genes and samples (FIG. 2A). Of the 216 new transcribed regions, 92 (43%) had at least a two-fold change in expression detected on the tiling array in at least one of the perturbations, suggesting that a large subset of the transcribed regions may have functional roles. The samples that had the most transcripts with at least two-fold expression change were the embryonic stem cells (ESC) relative to day 152 fetal pancreas (40 of 216) and invasive ductal breast carcinomas relative to normal (as many as 35 of 216), suggesting that a subset of these lncRNAs may play a role in self-renewal and carcinogenesis (FIG. 2A). Notably, lncRNA expression profiles of keratinocytes with knockdown of P63, which inhibits keratinocyte differentiation, clustered with that of ESC, suggesting that these ncRNAs may have a role in the undifferentiated state. Expression patterns from five keratinocyte samples that were transduced with the oncogene MYC alone or in combination with other oncogenes relative to controls clustered together, showing that MYC has a dominant effect on ncRNA expression. MYC-RAS-IκBα transduced human keratinocytes activate an ESC-like mRNA gene expression program and acquire properties of cancer stem cells (Wong et al. (2008) Cell Stem Cell 2:333-344). Notably, the lncRNA expression profile of MYC-RAS-IκBα cells clustered with that of ESCs (FIG. 2), suggesting a shared lncRNA signature for embryonic and cancer stem cells. In contrast, the E2F3-RAS-IκBα transduced keratinocytes, which do not express the ESC-like mRNA gene expression program, had an inverse pattern of expression for the majority of lncRNAs. In addition, eight primary human invasive ductal breast carcinomas split into two different groups based on their lncRNA profiles: four of the cancers clustered with the ESCs and MYC-RAS-IκBα tumors, and the other four clustered with the E2F3-RAS-IκBα tumors, suggesting that these tumor models mimic the expression pattern of not only mRNAs but also these lncRNAs in bona fide human cancers.
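For illustration, the two-fold change criterion applied to the expression matrix can be sketched in Python as below. The sketch assumes that each perturbation is summarized as a log2 ratio of experimental to control intensity; the function names, region labels and intensity values are hypothetical and are not data from this study.

```python
import math
from typing import Dict, List


def log2_ratios(experimental: List[float], control: List[float]) -> List[float]:
    """Log2 expression change of each experimental sample relative to its paired control."""
    return [math.log2(e / c) for e, c in zip(experimental, control)]


def changed_regions(matrix: Dict[str, List[float]], fold: float = 2.0) -> List[str]:
    """Return regions with at least a `fold` change (up or down) in any perturbation."""
    cutoff = math.log2(fold)
    return [name for name, row in matrix.items()
            if any(abs(value) >= cutoff for value in row)]


# Hypothetical intensities for three regions across two sample pairs.
matrix = {
    "region_a": log2_ratios([400.0, 120.0], [100.0, 100.0]),  # 4-fold up in pair 1
    "region_b": log2_ratios([110.0, 90.0], [100.0, 100.0]),   # below 2-fold in both
    "region_c": log2_ratios([95.0, 40.0], [100.0, 100.0]),    # 2.5-fold down in pair 2
}
changed = changed_regions(matrix)
```

Working in log2 space treats induction and repression symmetrically, so a single absolute-value cutoff covers both directions of change.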

The 216 lncRNAs are divided into three main clusters based on their expression pattern across all samples (FIG. 2). Notably, cluster 1 is composed of lncRNAs that are strongly induced in ESCs, keratinocytes with P63-knockdown and MYC-RAS-IκB tumors relative to differentiated cells and GFP-RAS-IκB tumors, which we interpret to be a ‘stemness cluster’ (FIG. 2B). Notably, each cluster is composed of many of the ncRNAs from the same genomic locus, suggesting that multiple adjacent ncRNAs are either coordinately regulated in a shared response or are spliced together as exons of one transcript. The high correlation between the dynamic expression patterns of these ncRNAs and different biological and cellular conditions suggests that these ncRNAs may be functional in the cell cycle, in self-renewal and in cancer.

TABLE 3 Experimental Samples And Conditions
Sample pair # | Experimental sample | Control sample
1 | Human fetal lung fibroblasts treated with doxorubicin (200 ng/μl) for 24 hours | Human fetal lung fibroblasts untreated
2 | Human fetal lung fibroblasts in low serum (0.01%) | Human fetal lung fibroblasts in normal serum
3 | Human fetal lung fibroblasts transduced with HPV-E7 | Human fetal lung fibroblasts transduced with vector control
4 | Human fetal lung fibroblasts transduced with HPV-E6 | Human fetal lung fibroblasts transduced with vector control
5 | Human fetal lung fibroblasts transduced with HRAS | Human fetal lung fibroblasts transduced with vector control
6 | HeLa synchronized by double thymidine block: 0 hr | HeLa asynchronous
7 | HeLa synchronized by double thymidine block: 2 hr | HeLa asynchronous
8 | HeLa synchronized by double thymidine block: 4 hr | HeLa asynchronous
9 | HeLa synchronized by double thymidine block: 6 hr | HeLa asynchronous
10 | HeLa synchronized by double thymidine block: 8 hr | HeLa asynchronous
11 | HeLa synchronized by double thymidine block: 10 hr | HeLa asynchronous
12 | HeLa synchronized by double thymidine block: 12 hr | HeLa asynchronous
13 | Primary human keratinocytes transduced with p63 shRNA | Primary human keratinocytes transduced with control shRNA
14 | Primary human keratinocytes treated with Ca²⁺ for 48 hours | Primary human keratinocytes without Ca²⁺ treatment
15 | U2OS synchronized by double thymidine block: 0 hr | U2OS asynchronous
16 | U2OS synchronized by double thymidine block: 2 hr | U2OS asynchronous
17 | U2OS synchronized by double thymidine block: 4 hr (A) | U2OS asynchronous
18 | U2OS synchronized by double thymidine block: 4 hr (B) | U2OS asynchronous
19 | U2OS synchronized by double thymidine block: 6 hr | U2OS asynchronous
20 | U2OS synchronized by double thymidine block: 8 hr | U2OS asynchronous
21 | U2OS synchronized by double thymidine block: 14 hr | U2OS asynchronous
22 | U2OS synchronized by double thymidine block: 16 hr | U2OS asynchronous
23 | Human ES (H9) | Human SOX17+ definitive endoderm
24 | Human ES (H9) | Human fetal pancreas day 76
25 | Human ES (H9) | Human fetal pancreas day 152
26 | MCF7 cell line | Human mammary epithelial cells
27 | Primary human keratinocytes transduced with MYC | Primary human keratinocytes transduced with GFP
28 | Primary human keratinocytes transduced with MYC | Primary human keratinocytes transduced with LacZ
29 | Primary human keratinocytes transduced with HRAS | Primary human keratinocytes transduced with GFP
30 | Primary human keratinocytes transduced with HRAS | Primary human keratinocytes transduced with LacZ
31 | Primary human keratinocytes transduced with E2F3 | Primary human keratinocytes transduced with GFP
32 | Primary human keratinocytes transduced with E2F3 | Primary human keratinocytes transduced with LacZ
33 | Primary human keratinocytes transduced with IkB | Primary human keratinocytes transduced with GFP
34 | Primary human keratinocytes transduced with IkB | Primary human keratinocytes transduced with LacZ
35 | Primary human keratinocytes transduced with MYC, RAS, and IkB | Primary human keratinocytes transduced with LacZ, RAS, and IkB
36 | Primary human keratinocytes transduced with MYC, RAS, and IkB | Primary human keratinocytes transduced with GFP, RAS, and IkB
37 | Primary human keratinocytes transduced with E2F3, RAS, and IkB | Primary human keratinocytes transduced with LacZ, RAS, and IkB
38 | Primary human keratinocytes transduced with E2F3, RAS, and IkB | Primary human keratinocytes transduced with GFP, RAS, and IkB
39 | Primary human keratinocytes transduced with SOX2, RAS, and IkB | Primary human keratinocytes transduced with LacZ, RAS, and IkB
40 | Primary human keratinocytes transduced with SOX2, RAS, and IkB | Primary human keratinocytes transduced with GFP, RAS, and IkB
41 | MYC-RAS-IkB tumor 1 | GFP-RAS-IkB tumor pool
42 | MYC-RAS-IkB tumor 2 | GFP-RAS-IkB tumor pool
43 | E2F3-RAS-IkB tumor 1 | GFP-RAS-IkB tumor pool
44 | E2F3-RAS-IkB tumor 2 | GFP-RAS-IkB tumor pool
45 | Invasive ductal breast carcinoma P2 | Normal breast tissue
46 | Invasive ductal breast carcinoma P3 | Normal breast tissue
47 | Invasive ductal breast carcinoma P4 | Normal breast tissue
48 | Invasive ductal breast carcinoma P5 | Normal breast tissue
49 | Invasive ductal breast carcinoma P6 | Normal breast tissue
50 | Invasive ductal breast carcinoma P7 | Normal breast tissue
51 | Invasive ductal breast carcinoma P9 | Normal breast tissue
52 | Invasive ductal breast carcinoma P10 | Normal breast tissue

TABLE 4 216 Identified Transcribed Regions
Gene ID | Unique ID | Name
GENE32X | chr7: 92301005-92301062 | int: CDK6: 143
GENE1X | chr9: 21921199-21921259 | dst: CDKN2A: 43877
GENE82X | chr16: 2417723-2417784 | upst: CCNF: −1721
GENE89X | chr4: 78222564-78222616 | upst: CCNI: −6398
GENE90X | chr4: 78222796-78222879 | upst: CCNI: −6621
GENE91X | chr4: 78223171-78223226 | upst: CCNI: −6883
GENE169X | chr6: 36750039-36750091 | upst: CDKN1A: −4845
GENE139X | chr17: 27834215-27834286 | upst: CDK5R1: −4044
GENE140X | chr17: 27833951-27834012 | upst: CDK5R1: −4410
GENE105X | chr1: 1325966-1326134 | upst: CCNL2: −1391
GENE106X | chr1: 1326824-1326881 | upst: CCNL2: −2253
GENE111X | chr1: 1325338-1325394 | upst: CCNL2: −767
GENE48X | chr19: 10539307-10539365 | int: CDKN2D: 1417
GENE109X | chr1: 1330111-1330167 | upst: CCNL2: −5540
GENE40X | chr6: 36756294-36756350 | int: CDKN1A: 1420
GENE23X | chr12: 47396450-47396506 | int: CCNT1: 602
GENE107X | chr1: 1327702-1327759 | upst: CCNL2: −3110
GENE142X | chr17: 27832743-27832798 | upst: CDK5R1: −5717
GENE112X | chr1: 1325562-1325622 | upst: CCNL2: −982
GENE80X | chr8: 95977656-95977717 | upst: CCNE2: −682
GENE29X | chr17: 27838400-27838457 | int: CDK5R1: 183
GENE141X | chr17: 27837748-27837804 | upst: CDK5R1: −482
GENE154X | chr13: 25725961-25726018 | upst: CDK8: −798
GENE159X | chr9: 129587505-129587574 | upst: CDK9: −646
GENE151X | chr7: 92303843-92303900 | upst: CDK6: −1860
GENE30X | chr7: 92299944-92300005 | int: CDK6: 1276
GENE31X | chr7: 92302482-92302535 | upst: CDK6: −533
GENE202X | chr1: 51198111-51198165 | upst: CDKN2C: −8037
GENE99X | chr14: 99016592-99016918 | upst: CCNK: −899
GENE211X | chr2: 96845550-96845607 | upst: CNNM3: −248
GENE177X | chr11: 2868308-2868438 | upst: CDKN1C: −4619
GENE44X | chr9: 21958371-21958427 | int: CDKN2A: 6667
GENE5X | chr9: 21979973-21980169 | int: ARF: 4530
GENE188X | chr9: 22015449-22015506 | upst: CDKN2B: −15913
GENE33X | chr7: 92302831-92302887 | upst: CDK6: −1679
GENE166X | chr6: 36753254-36753315 | upst: CDKN1A: −1210
GENE45X | chr9: 21997580-21997644 | int: CDKN2B: 1926
GENE0X | chr9: 21925765-21925933 | dst: CDKN2A: 39498
GENE100X | chr3: 158363197-158363249 | upst: CCNL1: −1968
GENE101X | chr3: 158363410-158363460 | upst: CCNL1: −2234
GENE102X | chr3: 158363666-158363729 | upst: CCNL1: −2383
GENE103X | chr3: 158364054-158364109 | upst: CCNL1: −2767
GENE147X | chr2: 219526326-219526431 | upst: CDK5R2: −6418
GENE130X | chr12: 56440276-56440345 | upst: CDK4: −7794
GENE170X | chr6: 36748634-36748699 | upst: CDKN1A: −5830
GENE46X | chr1: 51206425-51206482 | int: CDKN2C: 159
GENE116X | chr10: 42270296-42270371 | upst: CCNYL2: −36
GENE121X | chr6: 100131015-100131072 | upst: CCNC: −6760
GENE191X | chr9: 22002466-22002532 | upst: CDKN2B: −2817
GENE212X | chr2: 96845025-96845087 | upst: CNNM3: −970
GENE146X | chr2: 219526647-219526700 | upst: CDK5R2: −6045
GENE175X | chr11: 2866110-2866163 | upst: CDKN1C: −2196
GENE8X | chr11: 69165927-69165983 | int: CCND1: 874
GENE9X | chr12: 4254403-4254460 | int: CCND2: 1205
GENE180X | chr11: 2864269-2864326 | upst: CDKN1C: −446
GENE17X | chr4: 78298022-78298078 | int: CCNG2: 390
GENE26X | chr17: 71505019-71505076 | upst: CDK3: −4148
GENE57X | chr4: 122964592-122964649 | upst: CCNA2: −250
GENE38X | chrX: 18353741-18353799 | int: CDKL5: 64
GENE67X | chr12: 4250076-4250132 | upst: CCND2: −3165
GENE20X | chr14: 99017701-99017769 | int: CCNK: 210
GENE39X | chr6: 36755581-36755639 | int: CDKN1A: 885
GENE150X | chr2: 219523426-219523488 | upst: CDK5R2: −9197
GENE49X | chr2: 96847177-96847234 | int: CNNM3: 1459
GENE58X | chr11: 69163558-69163614 | upst: CCND1: −1659
GENE108X | chr1: 1324135-1324191 | int: CCNL2: 463
GENE72X | chr19: 34993553-34993625 | upst: CCNE1: −1190
GENE149X | chr2: 219524610-219524666 | upst: CDK5R2: −8037
GENE164X | chr5: 133731711-133731768 | upst: CDKL3: −867
GENE16X | chr5: 162797800-162797857 | int: CCNG1: 381
GENE66X | chr12: 4250336-4250392 | upst: CCND2: −2874
GENE186X | chr9: 22130077-22130138 | upst: CDKN2B: −130736
GENE18X | chr4: 78215107-78215164 | int: CCNI: 1042
GENE68X | chr12: 4249397-4249454 | upst: CCND2: −4757
GENE36X | chr9: 129588503-129588560 | int: CDK9: 352
GENE10X | chr12: 4254887-4254947 | int: CCND2: 1689
GENE37X | chrX: 18355359-18355416 | int: CDKL5: 1682
GENE143X | chr2: 219528834-219528890 | upst: CDK5R2: −4541
GENE134X | chr7: 150394096-150394152 | upst: CDK5: −7855
GENE157X | chr9: 129586747-129586804 | upst: CDK9: −1536
GENE65X | chr12: 4251907-4251963 | upst: CCND2: −1291
GENE62X | chr11: 69164970-69165027 | upst: CCND1: −377
GENE21X | chr3: 158360079-158360144 | int: CCNL1: 1097
GENE145X | chr2: 219532096-219532152 | upst: CDK5R2: −648
GENE110X | chr1: 1331923-1331980 | upst: CCNL2: −7336
GENE60X | chr11: 69162285-69162341 | upst: CCND1: −2768
GENE124X | chr12: 54645435-54645491 | upst: CDK2: −1390
GENE119X | chr16: 34107323-34107380 | upst: CCNYL3: −8181
GENE3X | chr9: 21956411-21956472 | dst: CDKN2A: 8650
GENE153X | chr13: 25726490-25726547 | upst: CDK8: −265
GENE129X | chr12: 56436900-56436957 | upst: CDK4: −4462
GENE182X | chr9: 21965167-21965225 | upst: CDKN2A: −44
GENE43X | chr9: 21959776-21959834 | int: CDKN2A: 5270
GENE93X | chr5: 159701961-159702018 | upst: CCNJL: −2749
GENE213X | chr2: 96788527-96788584 | upst: CNNM4: −1843
GENE148X | chr2: 219525252-219525310 | upst: CDK5R2: −7376
GENE22X | chr5: 54563848-54563906 | int: CCNO: 1417
GENE174X | chr11: 2863821-2863879 | upst: CDKN1C: −5
GENE179X | chr11: 2870355-2870412 | upst: CDKN1C: −6280
GENE55X | chr9: 21985446-21985507 | upst: ARF: −840
GENE64X | chr12: 4251368-4251433 | upst: CCND2: −1830
GENE137X | chr17: 27838057-27838134 | upst: CDK5R1: −206
GENE56X | chr13: 35903573-35903645 | upst: CCNA1: −1163
GENE15X | chr8: 95976064-95976125 | int: CCNE2: 647
GENE162X | chr9: 129587246-129587303 | upst: CDK9: −909
GENE118X | chr16: 34115099-34115155 | upst: CCNYL3: −293
GENE206X | chr14: 53933196-53933256 | upst: CDKN3: −271
GENE24X | chr2: 135393651-135393711 | int: CCNT2: 640
GENE59X | chr11: 69162503-69162560 | upst: CCND1: −2574
GENE113X | chr2: 135392543-135392600 | upst: CCNT2: −319
GENE138X | chr17: 27835198-27835254 | upst: CDK5R1: −3023
GENE158X | chr9: 129585474-129585543 | upst: CDK9: −3159
GENE161X | chr9: 129579484-129579540 | upst: CDK9: −8667
GENE77X | chr8: 95981621-95981682 | upst: CCNE2: −4956
GENE11X | chr6: 42015155-42015212 | int: CCND3: 2384
GENE172X | chr12: 12760224-12760281 | upst: CDKN1B: −1362
GENE92X | chr4: 78224048-78224105 | upst: CCNI: −7899
GENE114X | chr2: 135386111-135386168 | upst: CCNT2: −6751
GENE28X | chr7: 150383947-150384009 | int: CDK5: 1993
GENE160X | chr9: 129579642-129579699 | upst: CDK9: −8509
GENE61X | chr11: 69158023-69158080 | upst: CCND1: −7190
GENE181X | chr11: 2870788-2870849 | upst: CDKN1C: −7144
GENE207X | chr14: 53928983-53929043 | upst: CDKN3: −4479
GENE7X | chrX: 50041022-50041078 | upst: CCNB3: −3258
GENE14X | chr6: 42026838-42026902 | upst: CCND3: −9303
GENE155X | chr13: 25718422-25718479 | upst: CDK8: −8337
GENE47X | chr1: 51206791-51206847 | int: CDKN2C: 643
GENE117X | chr16: 34114348-34114405 | upst: CCNYL3: −1019
GENE132X | chr7: 150388302-150388358 | upst: CDK5: −2373
GENE50X | chr2: 96792179-96792239 | int: CNNM4: 1658
GENE79X | chr8: 95985217-95985273 | upst: CCNE2: −8552
GENE87X | chr5: 162788013-162788074 | upst: CCNG1: −9141
GENE69X | chr12: 4248317-4248374 | upst: CCND2: −4886
GENE98X | chr14: 99009355-99009417 | upst: CCNK: −8357
GENE135X | chr7: 150395034-150395090 | upst: CDK5: −9105
GENE183X | chr9: 22108313-22108371 | upst: CDKN2B: −108997
GENE6X | chr15: 57185158-57185215 | int: CCNB2: 547
GENE205X | chr14: 53931175-53931231 | upst: CDKN3: −2291
GENE52X | chr9: 21934843-21934899 | dst: CDKN2A: 30203
GENE125X | chr12: 54641615-54641672 | upst: CDK2: −5210
GENE104X | chr3: 158364606-158364664 | upst: CCNL1: −3430
GENE85X | chr16: 2415476-2415532 | upst: CCNF: −3964
GENE97X | chr14: 99013065-99013126 | upst: CCNK: −4426
GENE84X | chr16: 2415715-2415845 | upst: CCNF: −3743
GENE133X | chr7: 150389691-150389748 | upst: CDK5: −3754
GENE192X | chr9: 22034676-22034732 | upst: CDKN2B: −35359
GENE197X | chr9: 22086788-22086845 | upst: CDKN2B: −87467
GENE144X | chr2: 219527709-219527767 | upst: CDK5R2: −4915
GENE83X | chr16: 2417537-2417598 | upst: CCNF: −2075
GENE152X | chr7: 92309874-92309931 | upst: CDK6: −8726
GENE198X | chr9: 22089878-22089935 | upst: CDKN2B: −90566
GENE42X | chr9: 21960139-21960195 | int: CDKN2A: 4904
GENE41X | chr9: 21960611-21960667 | int: CDKN2A: 4432
GENE54X | chr9: 21986800-21986856 | upst: ARF: −2148
GENE185X | chr9: 22129741-22129797 | upst: CDKN2B: −130339
GENE51X | chr2: 96836476-96836537 | upst: CNNM3: −9238
GENE86X | chr5: 162792622-162792678 | upst: CCNG1: −4532
GENE4X | chr9: 21968740-21968798 | int: ARF: 15754
GENE81X | chr16: 2418355-2418411 | upst: CCNF: −1085
GENE189X | chr9: 22023307-22023365 | upst: CDKN2B: −23831
GENE171X | chr6: 36745166-36745227 | upst: CDKN1A: −9569
GENE19X | chr4: 78214282-78214339 | int: CCNI: 1874
GENE2X | chr9: 21919179-21919235 | dst: CDKN2A: 45866
GENE25X | chr6: 100122671-100122732 | int: CCNC: 816
GENE120X | chr6: 100128825-100128888 | upst: CCNC: −5405
GENE127X | chr12: 56434070-56434128 | upst: CDK4: −1632
GENE96X | chr14: 99014441-99014506 | upst: CCNK: −3241
GENE122X | chr16: 88278773-88278830 | upst: CDK10: −1805
GENE95X | chr5: 159700005-159700083 | upst: CCNJL: −671
GENE209X | chr14: 53927762-53927822 | upst: CDKN3: −5723
GENE187X | chr9: 22014430-22014488 | upst: CDKN2B: −15114
GENE78X | chr8: 95982746-95982803 | upst: CCNE2: −5939
GENE94X | chr5: 159706600-159706661 | upst: CCNJL: −7299
GENE12X | chr6: 42021792-42021848 | upst: CCND3: −4248
GENE35X | chr9: 129589962-129590019 | int: CDK9: 1811
GENE203X | chr1: 51197610-51197664 | upst: CDKN2C: −8538
GENE53X | chr9: 21985885-21985942 | upst: ARF: −1395
GENE70X | chr12: 4246316-4246376 | upst: CCND2: −6904
GENE131X | chr12: 56433408-56433465 | upst: CDK4: −977
GENE74X | chr19: 34989349-34989406 | upst: CCNE1: −5422
GENE76X | chr8: 95979501-95979558 | upst: CCNE2: −2828
GENE128X | chr12: 56434864-56434921 | upst: CDK4: −2133
GENE156X | chr13: 25717125-25717182 | upst: CDK8: −9630
GENE27X | chr17: 71504535-71504596 | upst: CDK3: −4497
GENE13X | chr6: 42023953-42024020 | upst: CCND3: −6423
GENE63X | chr11: 69156135-69156265 | upst: CCND1: −8918
GENE184X | chr9: 22119409-22119474 | upst: CDKN2B: −119804
GENE208X | chr14: 53928196-53928270 | upst: CDKN3: −5438
GENE201X | chr1: 51198789-51198848 | upst: CDKN2C: −7397
GENE115X | chr2: 208280834-208280903 | upst: CCNYL1: −3709
GENE165X | chr2: 39316396-39316459 | upst: CDKL4: −6205
GENE168X | chr6: 36752378-36752435 | upst: CDKN1A: −2237
GENE176X | chr11: 2867781-2867838 | upst: CDKN1C: −4093
GENE71X | chr12: 4244159-4244216 | upst: CCND2: −9042
GENE34X | chr13: 25727477-25727542 | int: CDK8: 566
GENE196X | chr9: 22000124-22000184 | upst: CDKN2B: −804
GENE73X | chr19: 34990318-34990374 | upst: CCNE1: −4445
GENE194X | chr9: 22073848-22073931 | upst: CDKN2B: −74328
GENE193X | chr9: 22052663-22052719 | upst: CDKN2B: −53107
GENE75X | chr19: 34985345-34985401 | upst: CCNE1: −9426
GENE200X | chr1: 51202987-51203052 | upst: CDKN2C: −3161
GENE88X | chr4: 78294528-78294586 | upst: CCNG2: −2953
GENE210X | chr10: 101077377-101077444 | upst: CNNM1: −2645
GENE167X | chr6: 36752597-36752655 | upst: CDKN1A: −1902
GENE204X | chr14: 53931488-53931551 | upst: CDKN3: −1974
GENE123X | chr16: 88276535-88276599 | upst: CDK10: −4173
GENE163X | chr9: 129578646-129578704 | upst: CDK9: −9782
GENE178X | chr11: 2869277-2869335 | upst: CDKN1C: −5693
GENE136X | chr7: 150395804-150395862 | upst: CDK5: −9871
GENE215X | chr2: 96785614-96785671 | upst: CNNM4: −4755
GENE190X | chr9: 22030432-22030490 | upst: CDKN2B: −31120
GENE126X | chr12: 54638785-54638842 | upst: CDK2: −8040
GENE195X | chr9: 22074531-22074588 | upst: CDKN2B: −75214
GENE199X | chr1: 51206030-51206086 | upst: CDKN2C: −127
GENE173X | chr11: 2864676-2864748 | upst: CDKN1C: −1017
GENE214X | chr2: 96786532-96786600 | upst: CNNM4: −3840

TABLE 5 Codon Substitution Frequency (CSF) Analysis
Name | Chromosome | Start coordinate | End coordinate | CSF score | BLAST result (E < 10⁻¹⁰) | Length
int: CDKL5: 64 | chrX | 18353741 | 18353799 | −114.14 | | 58
int: CDKL5: 1682 | chrX | 18355359 | 18355420 | −59.56 | | 61
upst: CCNB3: −3258 | chrX | 50041017 | 50041078 | −157.52 | | 61
dst: CDKN2A: 45866 | chr9 | 21919172 | 21919330 | 31.86 | | 158
dst: CDKN2A: 43877 | chr9 | 21921161 | 21921271 | 14.97 | | 110
dst: CDKN2A: 39498 | chr9 | 21925540 | 21925952 | 7.43 | | 412
dst: CDKN2A: 30203 | chr9 | 21934835 | 21934908 | 0.00 | | 73
dst: CDKN2A: 8650 | chr9 | 21956388 | 21956526 | 26.75 | | 138
int: CDKN2A: 6667 | chr9 | 21958371 | 21958427 | −114.20 | | 56
int: CDKN2A: 5270 | chr9 | 21959768 | 21959968 | −21.62 | | 200
int: CDKN2A: 4904 | chr9 | 21960134 | 21960195 | −109.94 | | 61
int: CDKN2A: 4432 | chr9 | 21960606 | 21960676 | −164.93 | | 70
upst: CDKN2A: −44 | chr9 | 21965167 | 21965225 | −56.05 | | 58
int: ARF: 15754 | chr9 | 21968736 | 21968809 | −196.59 | | 73
int: ARF: 4530 | chr9 | 21979960 | 21980193 | −91.25 | | 233
upst: ARF: −840 | chr9 | 21985330 | 21985704 | −14.20 | | 374
upst: ARF: −1395 | chr9 | 21985885 | 21986116 | −109.72 | gi|297684298|ref|XP_002819782.1| | 231
upst: ARF: −2148 | chr9 | 21986638 | 21986856 | −70.76 | | 218
int: CDKN2B: 1926 | chr9 | 21997386 | 21997668 | −123.40 | | 282
upst: CDKN2B: −804 | chr9 | 22000116 | 22000207 | −66.49 | | 91
upst: CDKN2B: −2817 | chr9 | 22002129 | 22002592 | 63.29 | gi|13569612|gb|AAK31162.1| | 463
upst: CDKN2B: −15114 | chr9 | 22014426 | 22014665 | 16.92 | | 239
upst: CDKN2B: −15913 | chr9 | 22015225 | 22015826 | 21.20 | gi|119593028|gb|EAW72622.1| | 601
upst: CDKN2B: −23831 | chr9 | 22023143 | 22023559 | 16.22 | | 416
upst: CDKN2B: −31120 | chr9 | 22030432 | 22030493 | −79.49 | | 61
upst: CDKN2B: −35359 | chr9 | 22034671 | 22034736 | −106.50 | | 65
upst: CDKN2B: −53107 | chr9 | 22052419 | 22052723 | −37.43 | | 304
upst: CDKN2B: −74328 | chr9 | 22073640 | 22073939 | −124.53 | | 299
upst: CDKN2B: −75214 | chr9 | 22074526 | 22074600 | −87.12 | | 74
upst: CDKN2B: −87467 | chr9 | 22086779 | 22086924 | −172.53 | | 145
upst: CDKN2B: −90566 | chr9 | 22089878 | 22089940 | −104.21 | | 62
upst: CDKN2B: −108997 | chr9 | 22108309 | 22108379 | −152.16 | | 70
upst: CDKN2B: −119804 | chr9 | 22119116 | 22119482 | −38.44 | | 366
upst: CDKN2B: −130339 | chr9 | 22129651 | 22129797 | −64.03 | | 146
upst: CDKN2B: −130736 | chr9 | 22130048 | 22130158 | −27.37 | | 110
upst: CDK9: −9782 | chr9 | 129578369 | 129578764 | −98.67 | | 395
upst: CDK9: −8667 | chr9 | 129579484 | 129579540 | −69.73 | | 56
upst: CDK9: −8509 | chr9 | 129579642 | 129579703 | −38.69 | | 61
upst: CDK9: −3159 | chr9 | 129584992 | 129585555 | −115.94 | | 563
upst: CDK9: −1536 | chr9 | 129586615 | 129586808 | −158.16 | | 193
upst: CDK9: −909 | chr9 | 129587242 | 129587312 | 1.76 | | 70
upst: CDK9: −646 | chr9 | 129587505 | 129587574 | 50.97 | | 69
int: CDK9: 352 | chr9 | 129588503 | 129588560 | −74.33 | | 57
int: CDK9: 1811 | chr9 | 129589962 | 129590019 | −135.53 | | 57
int: CCNE2: 647 | chr8 | 95976013 | 95976740 | 47.62 | | 727
upst: CCNE2: −682 | chr8 | 95977342 | 95978227 | −68.39 | | 885
upst: CCNE2: −2828 | chr8 | 95979488 | 95979576 | −72.47 | | 88
upst: CCNE2: −4956 | chr8 | 95981616 | 95981697 | −142.19 | | 81
upst: CCNE2: −5939 | chr8 | 95982599 | 95982807 | −37.33 | | 208
upst: CCNE2: −8552 | chr8 | 95985212 | 95985417 | −77.69 | | 205
int: CDK6: 1276 | chr7 | 92299872 | 92300181 | −93.60 | | 309
int: CDK6: 143 | chr7 | 92300772 | 92301101 | 64.35 | | 329
upst: CDK6: −533 | chr7 | 92301681 | 92302693 | −50.77 | | 1012
upst: CDK6: −1679 | chr7 | 92302827 | 92302910 | 24.85 | | 83
upst: CDK6: −1860 | chr7 | 92303008 | 92304502 | −4.10 | gi|169171680|ref|XP_001717196.1| | 1494
upst: CDK6: −8726 | chr7 | 92309874 | 92309931 | −145.14 | | 57
int: CDK5: 1993 | chr7 | 150383936 | 150384009 | −120.12 | | 73
upst: CDK5: −2373 | chr7 | 150388302 | 150388358 | −24.52 | | 56
upst: CDK5: −3754 | chr7 | 150389683 | 150389748 | −134.91 | | 65
upst: CDK5: −7855 | chr7 | 150393784 | 150394611 | 164.48 | gi|297289681|ref|XP_001103478.2| | 827
upst: CDK5: −9105 | chr7 | 150395034 | 150395090 | −156.60 | | 56
upst: CDK5: −9871 | chr7 | 150395800 | 150395870 | −70.35 | | 70
upst: CDKN1A: −9569 | chr6 | 36744895 | 36745227 | −102.36 | | 332
upst: CDKN1A: −5830 | chr6 | 36748634 | 36748699 | 9.40 | | 65
upst: CDKN1A: −4845 | chr6 | 36749619 | 36750963 | 9.93 | gi|1127256|pdb|1LCP|A | 1344
upst: CDKN1A: −2237 | chr6 | 36752227 | 36752462 | −114.64 | | 235
upst: CDKN1A: −1902 | chr6 | 36752562 | 36752655 | −74.54 | | 93
upst: CDKN1A: −1210 | chr6 | 36753254 | 36753322 | −170.24 | | 68
int: CDKN1A: 885 | chr6 | 36755349 | 36755717 | 12.28 | | 368
int: CDKN1A: 1420 | chr6 | 36755884 | 36756416 | 16.96 | | 532
int: CCND3: 2384 | chr6 | 42015146 | 42015714 | 10.65 | | 568
upst: CCND3: −4248 | chr6 | 42021778 | 42021857 | −70.48 | | 79
upst: CCND3: −6423 | chr6 | 42023953 | 42024036 | −29.89 | | 83
upst: CCND3: −9303 | chr6 | 42026833 | 42026919 | −46.71 | | 86
int: CCNC: 816 | chr6 | 100122595 | 100122744 | −54.98 | | 149
upst: CCNC: −5405 | chr6 | 100128816 | 100129047 | 7.35 | | 231
upst: CCNC: −6760 | chr6 | 100130171 | 100131105 | 26.17 | gi|38047525|gb|AAR09665.1| | 934
int: CCNO: 1417 | chr5 | 54563848 | 54563906 | −96.83 | | 58
upst: CDKL3: −867 | chr5 | 133731531 | 133731787 | −45.41 | | 256
upst: CCNJL: −671 | chr5 | 159699848 | 159700083 | −108.32 | | 235
upst: CCNJL: −2749 | chr5 | 159701926 | 159702174 | −103.49 | | 248
upst: CCNJL: −7299 | chr5 | 159706476 | 159706661 | −78.26 | | 185
upst: CCNG1: −9141 | chr5 | 162788013 | 162788190 | −58.38 | | 177
upst: CCNG1: −4532 | chr5 | 162792622 | 162792683 | −85.26 | | 61
int: CCNG1: 381 | chr5 | 162797535 | 162798278 | −44.97 | | 743
int: CCNI: 1874 | chr4 | 78214275 | 78214700 | −107.12 | | 425
int: CCNI: 1042 | chr4 | 78215107 | 78215164 | −87.28 | | 57
upst: CCNI: −6398 | chr4 | 78222547 | 78222634 | 23.04 | | 87
upst: CCNI: −6621 | chr4 | 78222770 | 78222967 | 22.93 | gi|109081011|ref|XP_001112542.1| | 197
upst: CCNI: −6883 | chr4 | 78223032 | 78223226 | 10.14 | gi|297674039|ref|XP_002815047.1| | 194
upst: CCNI: −7899 | chr4 | 78224048 | 78224113 | 2.21 | | 65
upst: CCNG2: −2953 | chr4 | 78294448 | 78294589 | −108.22 | | 141
int: CCNG2: 390 | chr4 | 78297791 | 78298343 | −84.39 | | 552
upst: CCNA2: −250 | chr4 | 122964592 | 122964728 | −42.04 | | 136
int: CCNL1: 1097 | chr3 | 158360079 | 158360144 | −4.72 | | 65
upst: CCNL1: −1968 | chr3 | 158363144 | 158363249 | 13.84 | | 105
upst: CCNL1: −2234 | chr3 | 158363410 | 158363460 | 25.17 | | 50
upst: CCNL1: −2383 | chr3 | 158363559 | 158363729 | 45.32 | gi|34035|emb|CAA31369.1| | 170
upst: CCNL1: −2767 | chr3 | 158363943 | 158364477 | 63.96 | gi|109076165|ref|XP_001084233.1| | 534
upst: CCNL1: −3430 | chr3 | 158364606 | 158364668 | −49.03 | | 62
upst: CDKL4: −6205 | chr2 | 39316382 | 39316464 | −3.06 | | 82
upst: CNNM4: −4755 | chr2 | 96785610 | 96785680 | −118.19 | | 70
upst: CNNM4: −3840 | chr2 | 96786525 | 96786610 | −99.47 | | 85
upst: CNNM4: −1843 | chr2 | 96788522 | 96788595 | −96.56 | | 73
int: CNNM4: 1658 | chr2 | 96792023 | 96792456 | −142.58 | | 433
upst: CNNM3: −9238 | chr2 | 96836476 | 96836537 | −28.88 | | 61
upst: CNNM3: −970 | chr2 | 96844744 | 96845262 | −21.66 | gi|297266562|ref|XP_001098957.2| | 518
upst: CNNM3: −248 | chr2 | 96845466 | 96846205 | 161.10 | gi|40068047|ref|NP_951060.1| | 739
int: CNNM3: 1459 | chr2 | 96847173 | 96847265 | −94.67 | | 92
upst: CCNT2: −6751 | chr2 | 135386111 | 135386176 | 3.59 | | 65
upst: CCNT2: −319 | chr2 | 135392543 | 135392600 | −12.28 | | 57
int: CCNT2: 640 | chr2 | 135393502 | 135393737 | −107.73 | | 235
upst: CCNYL1: −3709 | chr2 | 208280800 | 208280910 | −111.99 | | 110
upst: CDK5R2: −9197 | chr2 | 219523423 | 219523595 | −99.34 | | 172
upst: CDK5R2: −8037 | chr2 | 219524583 | 219524900 | −187.33 | | 317
upst: CDK5R2: −7376 | chr2 | 219525244 | 219525340 | −142.50 | | 96
upst: CDK5R2: −6418 | chr2 | 219526202 | 219526431 | 40.49 | gi|114688805|ref|XP_001152656.1| | 229
upst: CDK5R2: −6045 | chr2 | 219526575 | 219526998 | 88.39 | gi|119591067|gb|EAW70661.1| | 423
upst: CDK5R2: −4915 | chr2 | 219527705 | 219527770 | −66.12 | | 65
upst: CDK5R2: −4541 | chr2 | 219528079 | 219529078 | −19.83 | | 999
upst: CDK5R2: −648 | chr2 | 219531972 | 219532912 | 160.37 | gi|74005747|ref|XP_853120.1| | 940
int: CDKN2D: 1417 | chr19 | 10539238 | 10539446 | −94.96 | | 208
upst: CCNE1: −9426 | chr19 | 34985314 | 34985518 | −27.88 | | 204
upst: CCNE1: −5422 | chr19 | 34989318 | 34989418 | −142.01 | | 100
upst: CCNE1: −4445 | chr19 | 34990295 | 34990379 | −42.05 | | 84
upst: CCNE1: −1190 | chr19 | 34993550 | 34993681 | −33.48 | | 131
upst: CDK5R1: −5717 | chr17 | 27832500 | 27833788 | −10.04 | | 1288
upst: CDK5R1: −4410 | chr17 | 27833807 | 27834032 | −95.59 | | 225
upst: CDK5R1: −4044 | chr17 | 27834173 | 27834421 | −40.59 | | 248
upst: CDK5R1: −3023 | chr17 | 27835194 | 27835275 | −125.58 | | 81
upst: CDK5R1: −482 | chr17 | 27837735 | 27837831 | 45.53 | | 96
upst: CDK5R1: −206 | chr17 | 27838011 | 27838200 | 86.71 | | 189
int: CDK5R1: 183 | chr17 | 27838400 | 27838457 | 40.97 | | 57
upst: CDK3: −4497 | chr17 | 71504516 | 71504720 | −79.37 | | 204
upst: CDK3: −4148 | chr17 | 71504865 | 71505136 | −87.60 | | 271
upst: CCNF: −3964 | chr16 | 2415476 | 2415532 | −138.24 | | 56
upst: CCNF: −3743 | chr16 | 2415697 | 2415850 | −103.12 | | 153
upst: CCNF: −2075 | chr16 | 2417365 | 2417602 | −104.17 | | 237
upst: CCNF: −1721 | chr16 | 2417719 | 2418099 | −66.81 | | 380
upst: CCNF: −1085 | chr16 | 2418355 | 2418536 | −116.52 | | 181
upst: CCNYL3: −8181 | chr16 | 34107179 | 34107406 | −27.05 | | 227
upst: CCNYL3: −1019 | chr16 | 34114341 | 34114410 | 29.13 | | 69
upst: CCNYL3: −293 | chr16 | 34115067 | 34115160 | 29.25 | | 93
upst: CDK10: −4173 | chr16 | 88276405 | 88276870 | 26.25 | gi|119587116|gb|EAW66712.1| | 465
upst: CDK10: −1805 | chr16 | 88278773 | 88278929 | 20.45 | | 156
int: CCNB2: 547 | chr15 | 57185158 | 57185327 | −99.21 | | 169
upst: CDKN3: −5723 | chr14 | 53927739 | 53927822 | −31.89 | | 83
upst: CDKN3: −5438 | chr14 | 53928024 | 53928439 | −33.70 | | 415
upst: CDKN3: −4479 | chr14 | 53928983 | 53929052 | 8.19 | | 69
upst: CDKN3: −2291 | chr14 | 53931171 | 53931235 | 37.52 | | 64
upst: CDKN3: −1974 | chr14 | 53931488 | 53931574 | −63.44 | | 86
upst: CDKN3: −271 | chr14 | 53933191 | 53933452 | −53.81 | | 261
upst: CCNK: −8357 | chr14 | 99009134 | 99009421 | −109.96 | | 287
upst: CCNK: −4426 | chr14 | 99013065 | 99013134 | −138.67 | | 69
upst: CCNK: −3241 | chr14 | 99014250 | 99014509 | −98.38 | | 259
upst: CCNK: −899 | chr14 | 99016592 | 99016918 | 13.36 | | 326
int: CCNK: 210 | chr14 | 99017701 | 99018238 | −0.57 | | 537
upst: CDK8: −9630 | chr13 | 25717125 | 25717190 | 0.00 | | 65
upst: CDK8: −8337 | chr13 | 25718418 | 25718483 | −122.01 | | 65
upst: CDK8: −798 | chr13 | 25725957 | 25726089 | −2.39 | | 132
upst: CDK8: −265 | chr13 | 25726490 | 25726547 | 29.46 | | 57
int: CDK8: 566 | chr13 | 25727321 | 25727803 | −61.92 | | 482
upst: CCNA1: −1163 | chr13 | 35903469 | 35904076 | 10.22 | | 607
upst: CCND2: −9042 | chr12 | 4244156 | 4244216 | −60.32 | | 60
upst: CCND2: −6904 | chr12 | 4246294 | 4246385 | −163.60 | | 91
upst: CCND2: −4886 | chr12 | 4248312 | 4248374 | −81.37 | | 62
upst: CCND2: −4757 | chr12 | 4248441 | 4249910 | 30.54 | | 1469
upst: CCND2: −3165 | chr12 | 4250033 | 4250139 | −118.08 | | 106
upst: CCND2: −2874 | chr12 | 4250324 | 4251151 | −5.17 | | 827
upst: CCND2: −1830 | chr12 | 4251268 | 4251445 | 3.56 | | 177
upst: CCND2: −1291 | chr12 | 4251907 | 4251963 | −73.06 | | 56
int: CCND2: 1205 | chr12 | 4254403 | 4254460 | −16.55 | | 57
int: CCND2: 1689 | chr12 | 4254887 | 4254947 | −12.75 | | 60
upst: CDKN1B: −1362 | chr12 | 12760213 | 12760283 | −162.03 | | 70
int: CCNT1: 602 | chr12 | 47396446 | 47396533 | −116.72 | | 87
upst: CDK2: −8040 | chr12 | 54638785 | 54638842 | −98.60 | | 57
upst: CDK2: −5210 | chr12 | 54641615 | 54641672 | −164.71 | | 57
upst: CDK2: −1390 | chr12 | 54645435 | 54645491 | −65.27 | | 56
upst: CDK4: −977 | chr12 | 56433408 | 56433465 | −45.37 | | 57
upst: CDK4: −1632 | chr12 | 56434063 | 56434131 | −103.47 | | 68
upst: CDK4: −2133 | chr12 | 56434564 | 56435185 | 127.35 | | 621
upst: CDK4: −4462 | chr12 | 56436893 | 56436962 | −109.43 | | 69
upst: CDK4: −7794 | chr12 | 56440225 | 56440345 | −116.15 | | 120
upst: CDKN1C: −5 | chr11 | 2863582 | 2864004 | 88.97 | | 422
upst: CDKN1C: −446 | chr11 | 2864023 | 2864511 | −41.83 | | 488
upst: CDKN1C: −1017 | chr11 | 2864594 | 2864748 | −126.83 | | 154
upst: CDKN1C: −2196 | chr11 | 2865773 | 2866560 | −62.22 | gi|119622932|gb|EAX02527.1| | 787
upst: CDKN1C: −4093 | chr11 | 2867670 | 2867845 | −20.97 | | 175
upst: CDKN1C: −4619 | chr11 | 2868196 | 2868608 | −34.51 | | 412
upst: CDKN1C: −5693 | chr11 | 2869270 | 2869499 | −30.57 | | 229
upst: CDKN1C: −6280 | chr11 | 2869857 | 2870421 | −8.76 | | 564
upst: CDKN1C: −7144 | chr11 | 2870721 | 2870849 | −2.69 | | 128
upst: CCND1: −8918 | chr11 | 69156135 | 69156265 | 6.45 | | 130
upst: CCND1: −7190 | chr11 | 69157863 | 69158125 | −78.41 | | 262
upst: CCND1: −2768 | chr11 | 69162285 | 69162341 | −11.64 | | 56
upst: CCND1: −2574 | chr11 | 69162479 | 69162560 | −50.92 | | 81
upst: CCND1: −1659 | chr11 | 69163394 | 69163640 | −18.16 | | 246
upst: CCND1: −377 | chr11 | 69164676 | 69165039 | −21.36 | | 363
int: CCND1: 874 | chr11 | 69165927 | 69165983 | −27.89 | | 56
upst: CCNYL2: −36 | chr10 | 42270204 | 42270371 | −42.89 | | 167
upst: CNNM1: −2645 | chr10 | 101077377 | 101077475 | −111.90 | | 98
int: CCNL2: 463 | chr1 | 1324108 | 1324220 | −16.91 | | 112
upst: CCNL2: −767 | chr1 | 1325338 | 1325394 | −48.64 | | 56
upst: CCNL2: −982 | chr1 | 1325553 | 1325755 | −11.06 | | 202
upst: CCNL2: −1391 | chr1 | 1325962 | 1326156 | −97.29 | | 194
upst: CCNL2: −2253 | chr1 | 1326824 | 1326881 | 14.19 | | 57
upst: CCNL2: −3110 | chr1 | 1327681 | 1327845 | −69.05 | | 164
upst: CCNL2: −5540 | chr1 | 1330111 | 1330167 | −68.46 | | 56
upst: CCNL2: −7336 | chr1 | 1331907 | 1332072 | 60.70 | gi|114575193|ref|XP_001156960.1| | 165
upst: CDKN2C: −8538 | chr1 | 51197610 | 51197664 | −74.07 | | 54
upst: CDKN2C: −8037 | chr1 | 51198111 | 51198165 | −16.04 | | 54
upst: CDKN2C: −7397 | chr1 | 51198751 | 51199005 | −124.53 | | 254
upst: CDKN2C: −3161 | chr1 | 51202987 | 51203052 | −34.49 | | 65
upst: CDKN2C: −127 | chr1 | 51206021 | 51206095 | −156.53 | | 74
int: CDKN2C: 159 | chr1 | 51206307 | 51206575 | −55.03 | gi|239741164|ref|XP_002342150.1| | 268
int: CDKN2C: 643 | chr1 | 51206791 | 51206847 | −75.40 | | 56

A Gene Co-Expression Map Infers Trans Regulatory Mechanisms and Biological Functions

Multiple lncRNAs, including p15AS and the lncRNA upstream of CCND1, have been shown to regulate the transcription of the nearby coding gene. To determine whether gene-proximal lncRNAs are typically correlated with the expression of the nearest mRNA, we conducted whole-genome expression arrays on 17 samples that were also examined on our tiling array and calculated pairwise Pearson correlations between the expression patterns of each cell-cycle promoter lncRNA versus every mRNA genome wide. Notably, there was no significant correlation or anti-correlation between most of the 216 lncRNAs and the nearby protein-coding mRNA, suggesting that most of the lncRNAs may not function in cis to activate or repress nearby mRNA expression (FIG. 3A). Quantitative RT-PCR (qRT-PCR) analysis of lncRNAs and neighboring 5′ and 3′ mRNAs in 34 additional samples confirmed these findings (FIG. 10). In contrast, we found that the median correlation between two ncRNAs of the same locus was positive, supporting our hypothesis that neighboring ncRNAs may be coordinately regulated, may positively regulate each other, and/or may be exons of the same transcript (FIG. 3B).
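As an illustration only, the pairwise Pearson correlation described above can be sketched as follows. The function names, gene labels and expression values are hypothetical placeholders and are not taken from the arrays described herein; the sketch simply shows how one lncRNA expression profile may be correlated against every mRNA profile measured across the same samples.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        return float("nan")  # a flat profile has no defined correlation
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(sxx * syy)

def correlate_lncrna_to_mrnas(lncrna_profile, mrna_profiles):
    """Return {mRNA name: R} for one lncRNA profile against every mRNA profile."""
    return {gene: pearson(lncrna_profile, prof)
            for gene, prof in mrna_profiles.items()}

# Hypothetical expression values across 5 samples (illustrative only).
lnc = [1.0, 2.0, 3.0, 4.0, 5.0]
mrnas = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with the lncRNA
    "GENE_B": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as the lncRNA rises
}
r = correlate_lncrna_to_mrnas(lnc, mrnas)
```

In practice the same calculation would be repeated for each lncRNA, giving one correlation value per lncRNA-mRNA pair.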

Given that expression of the 216 ncRNAs does not generally correlate with the mRNA in cis, we further explored the genes and pathways that they may regulate using a guilt-by-association approach (Guttman et al. (2009) Nature 458:223-227, herein incorporated by reference). For each lncRNA, we defined a co-expression gene set as the group of mRNAs that are positively or negatively correlated with that lncRNA across the 17 samples (R > 0.5 or R < −0.5, respectively) (FIG. 11). We then constructed a gene module map of the association of each lncRNA co-expression gene set versus the Gene Ontology Biological Processes gene set and performed biclustering to identify lncRNAs that are associated with distinct Gene Ontology terms (FIG. 3C) (Segal et al. (2004) Nat. Genet. 36:1090-1098, herein incorporated by reference in its entirety). This analysis revealed multiple sets of lncRNAs that are associated with biological processes including cell cycle, DNA recombination, ribonucleoprotein complex biogenesis and assembly, RNA splicing, and response to DNA damage. Thus, despite having limited correlation in expression to their neighboring protein-coding gene, the expression patterns of these lncRNAs are still strongly related to the cell cycle. We constructed a similar module map with curated gene sets of metabolic and signaling pathways as well as biological and clinical states from the Molecular Signatures Database (MSigDB c2 collection) (Subramanian et al. (2005) Proc. Natl. Acad. Sci. USA 102:15545-15550). This module map confirmed the enrichment for cell-cycle-related sets (for example, Cell Cycle Brentani or Cell Cycle KEGG). In addition, enriched modules included several poor prognosis breast cancer gene sets (BRCA estrogen receptor negative, BRCA prognosis negative and BRCA1 overexpressed up), DNA-damage-related gene sets (UVA/UVB), and several oncogenic signatures.
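By way of a non-limiting sketch, the construction of a co-expression gene set and its comparison against an annotated gene set may be expressed as follows. The correlation cutoffs are taken as R > 0.5 for the positive set and R < −0.5 for the negative set. The use of a hypergeometric tail probability to score overlap is an assumption made for illustration (the text above does not specify the statistic), and all names and values are hypothetical.

```python
from math import comb

def coexpression_sets(correlations, threshold=0.5):
    """Split {gene: R} into positively and negatively co-expressed gene sets."""
    positive = {g for g, r in correlations.items() if r > threshold}
    negative = {g for g, r in correlations.items() if r < -threshold}
    return positive, negative

def hypergeom_enrichment_p(n_universe, n_annotated, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_universe genes, of which n_annotated carry the annotation."""
    upper = min(n_annotated, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical correlations of five mRNAs with one lncRNA (illustrative only).
corr = {"G1": 0.8, "G2": 0.51, "G3": 0.2, "G4": -0.7, "G5": -0.5}
pos_set, neg_set = coexpression_sets(corr)

# Hypothetical counts: 10 genes measured, 4 annotated, 3 selected, 2 overlapping.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
```

Note that a gene with R exactly equal to a cutoff (G5 above) falls in neither set, because both inequalities are strict.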

Validation of ncRNA Expression in Cell Cycle, ESC Differentiation, Cancer and DNA Damage Response

To validate these inferred functional associations, we designed qRT-PCR assays for 60 of the 216 new transcribed regions (53 upstream and 7 intronic) to obtain a more quantitative measure of these lncRNAs across different conditions. Expression in HeLa cells synchronized in cell cycle progression by double thymidine block showed that most of the lncRNAs have periodic expression peaking at different phases of the cell cycle (FIG. 4A) (Whitfield et al. (2002) Mol. Biol. Cell 13:1977-2000). Parallel analysis in primary human fibroblasts synchronized by serum stimulation confirmed the peak cell cycle phase of 74% of the lncRNAs with periodic expression pattern during the cell cycle (FIG. 4B). Next, comparison of human ESCs and fetal pancreas at days 76 and 152 showed that a majority of these lncRNAs are regulated during differentiation (FIG. 4C). In addition, unsupervised clustering of lncRNA expression patterns in five metastatic breast cancers and five normal mammary tissues readily distinguished the five metastatic breast cancers from the normal mammary tissues (FIG. 4D). Some of the lncRNAs, including upst:CCNL1:−2,767 and int:CDKN1A:+885 (Table 3), are repressed in the metastatic breast cancers relative to normal mammary tissues, whereas others, including upst:CDKN1A:−4,845, upst:CDKN2B:−2,817 and int:ARF:+4,517, are induced. Thus, the majority of these lncRNAs have periodic expression in the cell cycle and are differentially expressed in different states of cell differentiation and cancer progression.

Our co-expression maps predicted associations of several lncRNAs with DNA damage response pathways (FIG. 3C and FIG. 11). In support of this finding, doxorubicin-treated human fetal lung fibroblasts showed at least two-fold change in 12 of the 216 ncRNAs on the tiling array and by qRT-PCR (FIG. 2). Notably, 2 of those 12 ncRNAs were located 5′ of the TSS of the canonical p53 target gene CDKN1A (upst:CDKN1A:−1,210 and upst:CDKN1A:−4,845), and, similar to the CDKN1A mRNA, were induced by doxorubicin (FIG. 5A). In addition, a third lncRNA at the CDKN1A locus, upst:CDKN1A:−800, was also induced by doxorubicin but was not included in the 216 lncRNAs because it was only expressed in one of the 108 samples, the doxorubicin-treated fibroblasts. To confirm whether these lncRNAs may be responsive to DNA damage, we measured the expression changes of 60 lncRNAs predicted in the DNA damage pathway (as well as upst:CDKN1A:−800) by quantitative RT-PCR in human fetal lung fibroblasts treated with doxorubicin over a 24 hour time course. Most of the lncRNAs were either markedly induced or repressed by doxorubicin, and all five of the tested lncRNAs surrounding the CDKN1A TSS were induced, including the three that were previously detected on the tiling array (FIG. 5B). Notably, several lncRNAs upstream of CDKN1A are induced more rapidly and with substantially higher magnitude than CDKN1A upon DNA damage. Upst:CDKN1A:−4,845 is induced up to 40-fold upon DNA damage (FIG. 5C). These variations in expression patterns within the same locus suggest that the lncRNAs in the CDKN1A locus may play roles in the DNA damage response that are distinct from that of the CDKN1A protein, p21.
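For illustration, fold changes such as those reported above are commonly derived from qRT-PCR threshold cycle (Ct) values by the 2^−ΔΔCt method. The text does not state which quantification method was used, so the following sketch is an assumption, and the Ct values are hypothetical rather than measured data.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression (treated / control) by the 2^-ddCt method.

    Each target Ct is first normalized to a reference (housekeeping)
    transcript measured in the same sample; a lower Ct means more template.
    """
    d_treated = ct_target_treated - ct_ref_treated
    d_control = ct_target_control - ct_ref_control
    return 2.0 ** -(d_treated - d_control)

# Hypothetical Ct values (illustrative only): the target transcript crosses
# threshold 3 cycles earlier after treatment, the reference is unchanged.
fold = fold_change_ddct(ct_target_treated=22.0, ct_ref_treated=18.0,
                        ct_target_control=25.0, ct_ref_control=18.0)
```

The method assumes roughly equal, near-100% amplification efficiency for the target and reference assays; where that does not hold, an efficiency-corrected calculation would be more appropriate.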

PANDA: a Long ncRNA Involved in the DNA-Damage Response

To investigate the functional relevance of these lncRNAs at the CDKN1A locus, we selected upst:CDKN1A:−4,845 (SEQ ID NO:1), hereafter termed PANDA (P21 associated ncRNA DNA damage activated), for further analysis. PANDA is located approximately 5 kb upstream of the CDKN1A TSS, coincides with a cluster of previously annotated expressed sequence tags and is evolutionarily conserved (FIG. 13). Although the PANDA locus intersects a computationally predicted pseudogene of LAP3, qRT-PCR showed that PANDA was specifically induced by DNA damage, whereas LAP3 expression did not significantly change, confirming that the change in expression detected by the tiling array was not caused by cross hybridization with LAP3 (FIG. 14). Furthermore, the CSF score of PANDA, 9.3, indicated very low protein-coding potential compared to LAP3 (with a CSF range of 117-1,343 for its 13 exons). Rapid amplification of the 5′ and 3′ complementary DNA ends (RACE, SEQ ID NO:2 and SEQ ID NO:3) and RNA blot analysis revealed a 1.5-kb transcript that is divergently transcribed from CDKN1A, antisense of the predicted LAP3 pseudogene (FIG. 5D). Thus, PANDA is a 5′-capped and polyadenylated non-spliced lncRNA that is transcribed antisense to CDKN1A.

Because p53 is a positive regulator of CDKN1A during the DNA damage response, we asked whether p53 also regulates PANDA expression. ChIP-chip analysis confirmed the p53 binding site immediately upstream of the CDKN1A TSS (FIG. 5A) (Wei et al. (2006) Cell 124:207-219). PANDA and CDKN1A are diametrically situated 2.5 kb from this intervening p53 binding site, which supports the possibility of p53 co-regulation. Indeed, siRNA-mediated knockdown of p53 before DNA damage inhibited the induction of PANDA by 70% at 24 hours after DNA damage (FIG. 5E and FIG. 15), which is similar to its effect on CDKN1A. In contrast, RNA interference of CDKN1A had no effect on PANDA expression, indicating that PANDA is not a linked transcript of CDKN1A and that PANDA expression is not dependent on p21. PANDA level shows a trend of lower expression in human primary breast tumors harboring an inactivating mutation in TP53 as determined by exon 2-11 DNA sequencing (FIG. 16A) (Geisler et al. (2001) Cancer Res. 61:2505-2512). Further, complementation of p53-null H1299 lung carcinoma cells by wild-type p53, but not by the loss-of-function p53 (p.Val272Cys) mutant, restored DNA damage-inducible expression of PANDA (FIG. 5F). Notably, a gain-of-function p53 (p.Arg273His) mutant, observed in Li-Fraumeni syndrome (Olive et al. (2004) Cell 119:847-860), abrogated the ability to induce CDKN1A but selectively preserved the ability to induce PANDA (FIG. 5F). We also observed selective induction of PANDA without concordant CDKN1A expression in metastatic ductal carcinomas but not in normal breast tissue (FIG. 16B).

Next, we addressed whether PANDA affects the DNA damage response. We transduced human fetal fibroblasts (FL3) with custom siRNAs targeting PANDA and then applied doxorubicin for 24 hours following the knockdown (FIG. 6A). Global gene expression analysis showed that 224 genes were induced and 193 genes were repressed at least twofold by PANDA knockdown (FIG. 6B). Genes induced by PANDA knockdown were significantly enriched for those involved in apoptosis, such as the Gene Ontology terms ‘cell death’ (P < 0.04) and ‘apoptosis’ (P < 0.03) (FIG. 6B). qRT-PCR confirmed that PANDA depletion induced several genes encoding canonical activators of apoptosis, including APAF1, BIK, FAS and LRDD (FIG. 6C). On the other hand, expression of neither CDKN1A itself nor TP53 was affected by PANDA depletion (FIG. 6D), suggesting that PANDA is a p53 effector that acts independently of p21CDKN1A.
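The "at least twofold" criterion used above can be illustrated with the following sketch, which sorts genes into induced and repressed groups from paired expression values. The gene labels and values are hypothetical placeholders and do not reproduce the expression data described herein; the twofold cutoff is the only element taken from the text.

```python
import math

def classify_by_fold_change(knockdown, control, min_fold=2.0):
    """Return (induced, repressed) gene lists from {gene: expression} dicts.

    Genes absent from either dict, or with non-positive values, are skipped
    because a log ratio cannot be computed for them.
    """
    cutoff = math.log2(min_fold)
    induced, repressed = [], []
    for gene in sorted(knockdown.keys() & control.keys()):
        kd, ctl = knockdown[gene], control[gene]
        if kd <= 0 or ctl <= 0:
            continue
        log_ratio = math.log2(kd / ctl)
        if log_ratio >= cutoff:
            induced.append(gene)
        elif log_ratio <= -cutoff:
            repressed.append(gene)
    return induced, repressed

# Hypothetical expression values (illustrative only).
control_expr = {"GENE_A": 10.0, "GENE_B": 20.0, "GENE_C": 50.0, "GENE_D": 40.0}
knockdown_expr = {"GENE_A": 25.0, "GENE_B": 40.0, "GENE_C": 52.0, "GENE_D": 10.0}
induced, repressed = classify_by_fold_change(knockdown_expr, control_expr)
```

Working on the log2 scale treats induction and repression symmetrically: a twofold increase and a twofold decrease lie the same distance from zero.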

DNA damage in human fibroblasts triggers p53-dependent G1 arrest but not apoptosis (Agarwal et al. (1995) Proc. Natl. Acad. Sci. USA 92:8493-8497; Di Leonardo et al. (1994) Genes Dev. 8:2540-2551). Consistent with this finding, doxorubicin treatment in FL3 cells exposed to control siRNA had little to no apoptosis as measured by TUNEL. In contrast, PANDA knockdown resulted in fivefold to sevenfold increased TUNEL-positive cells (FIGS. 6E and 6F). Immunoblot analysis of PARP, a caspase substrate and marker of apoptosis, revealed PARP cleavage only in PANDA-depleted cells (FIG. 6G). In contrast, six additional siRNAs targeting other transcripts within the CDKN1A promoter had no effect on apoptosis (data not shown; FIG. 17). Thus, PANDA knockdown sensitized fibroblasts to DNA-damage-induced apoptosis. Altogether, these data suggest that in parallel with p53-mediated induction of CDKN1A for cell cycle arrest, p53-mediated induction of PANDA delimits apoptosis.

Core promoters of cell death genes downstream of p53 are distinguished from other p53 target genes by the binding site for the transcription factor NF-YA (Morachis et al. (2010) Genes Dev. 24:135-147), and we reasoned that PANDA may affect NF-YA function. RNA chromatography (Michlewski (2010) RNA 16:1673-1678, herein incorporated by reference) using purified, in vitro transcribed PANDA RNA, but not a 1.2-kb LacZ mRNA fragment, specifically retrieved NF-YA from cellular lysates of human fibroblasts induced by DNA damage (FIG. 7A). PANDA did not retrieve other chromatin modification complexes that can bind other lncRNAs, such as EZH2 or LSD1 (Khalil et al. (2009) Proc. Natl. Acad. Sci. USA 106:11667-11672; Tsai et al. (2010) Science 329:689-693), or p21, illustrating the specificity of the interaction. Immunoprecipitation of NF-YA from doxorubicin-treated primary human lung fibroblasts specifically retrieved endogenous PANDA (FIG. 7B). NF-YA is a nuclear transcription factor that activates the p53-responsive promoter of FAS upon DNA damage (Morachis et al. (2010) Genes Dev. 24:135-147). Depletion of PANDA substantially increased NF-YA occupancy at target genes, including CCNB1, FAS, BBC3 (also known as PUMA) and PMAIP1 (also known as NOXA) (FIG. 7C). Moreover, concomitant knockdown of NF-YA and PANDA substantially attenuated induction of apoptotic genes and apoptosis as measured by TUNEL, indicating that NF-YA is required in part for cell death triggered by loss of PANDA (FIGS. 7D and 7E). Thus, PANDA binding to NF-YA may evict or prevent NF-YA binding to chromatin. These data suggest that DNA damage activates p53-mediated transcription at CDKN1A and PANDA, which function synergistically to mediate cell cycle arrest and survival. CDKN1A mRNA produces p21 to mediate arrest, whereas PANDA impedes NF-YA activation of the apoptotic gene expression program (FIG. 8).

Discussion

Recent studies have revealed that a surprisingly large fraction of mammalian genomes is transcribed. In addition to small noncoding RNAs, long noncoding RNAs can be produced from gene promoters and enhancers, as well as stand-alone intergenic loci (Guttman et al. (2009) Nature 458:223-227; Katayama et al. (2005) Science 309:1564-1566; and De Santa et al. (2010) PLoS Biol. 8:e1000384). New approaches are needed that not only identify ncRNAs but also provide insight into their potential biological function.

Using an ultrahigh-resolution tiling array, we interrogated the transcriptional landscape at cell-cycle promoters in 108 samples that represent diverse perturbations. The ability to interrogate numerous and diverse biological samples in a rapid and economical fashion is advantageous for at least two reasons. First, many of the noncoding transcripts are induced only in highly specific conditions and may have been missed if only a few conditions were surveyed. Of the 216 new noncoding transcribed regions we identified, on average, only 73 of these are transcribed in any one biological sample. Second, comparison of lncRNA profiles amongst these diverse samples highlighted unexpected similarities in cell cycle promoter states among distinct perturbations. For instance, we identified a similarity of promoter states among ESCs, tumors induced by MYC and epithelial progenitors depleted of the differentiation regulator p63. Likewise, authentic human tumors can be classified based on the similarity of their promoter states to those of cells with defined oncogenic perturbation.

Noncoding transcription through regulatory elements may affect gene activity in a variety of ways. The act of transcription may open compacted chromatin over regulatory sequences or compete with transcription factor binding (so-called transcriptional interference). In addition, the ncRNA product may modulate neighboring gene expression in cis (Lee (2009) Genes Dev. 23:1831-1842; Kanhere et al. (2010) Mol. Cell. 38:675-688), affect distantly located genes in trans (Rinn et al. (2007) Cell 129:1311-1323) or even serve as a target for regulation by small regulatory RNAs (Han et al. (2007) Proc. Natl. Acad. Sci. USA 104:12422-12427; Schwartz et al. (2008) Nat. Struct. Mol. Biol. 15:842-848).

Because these different mechanisms predict distinct relationships between levels of ncRNAs and cognate mRNAs, we compared ncRNA and mRNA expression profiles across our samples. We found that most promoter ncRNAs are neither positively nor negatively correlated in expression with their neighboring mRNA but are rather correlated in expression with genes located elsewhere in the genome. The genes co-expressed (and presumably co-regulated) with promoter ncRNAs function in specific biological pathways, including cell cycle, DNA damage response and stem cell differentiation, and have been associated with cancer prognosis. Quantitative RT-PCR analysis further validated that many of these ncRNAs are differentially expressed in the cell cycle and in human cancers, and are regulated in response to DNA damage or ESC differentiation. These findings suggest that cell-cycle ncRNAs may participate in gene regulation in trans. In addition, noncoding transcription of cell-cycle promoters may be a form of regulatory anticipation or feedback to modulate the chromatin state of cell-cycle promoters.

Our results suggest that the human genome is organized into genomic units that code for multiple transcripts that function in the same biological pathways (FIG. 8). Forty-nine of 56 cell-cycle protein-coding gene loci have at least one detected lncRNA and an average of four lncRNAs within 10 kb upstream and 2 kb downstream of the TSS. At the CDKN1A promoter, five lncRNAs, similar to the CDKN1A mRNA itself, are induced by DNA damage. One of these lncRNAs, which we named PANDA, is a non-spliced 1.5-kb ncRNA that is transcribed antisense to CDKN1A and is induced with faster kinetics than CDKN1A. Loss-of-function and complementation experiments show that PANDA induction during DNA damage is p53 dependent. In contrast, depletion of CDKN1A or depletion of PANDA had no effect on the other's response to DNA damage, indicating that their induction by p53 occurs in parallel. PANDA inhibits the expression of apoptotic genes by sequestering the transcription factor NF-YA from occupying target gene promoters. Whereas CDKN1A encodes a cell cycle inhibitor to mediate cell cycle arrest, PANDA promotes cell survival by impeding the apoptotic gene expression program. This linkage can apparently be exploited by tumors: the ability of the Li-Fraumeni gain-of-function p53 mutant R273H to selectively retain PANDA induction instead of CDKN1A in effect uncouples cell survival from cell cycle arrest, which was similarly observed in metastatic ductal carcinomas. Thus, lncRNAs like PANDA may provide new explanations for human cancer susceptibility.
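As a non-limiting illustration, the lncRNA identifiers used herein (for example, upst:CDKN1A:−4,845 or int:ARF:+4530) can be parsed and checked against the window of 10 kb upstream and 2 kb downstream of the TSS mentioned above. The sketch assumes that the prefix denotes the region type (upst, int or dst) and that the trailing number is a signed offset in base pairs relative to the TSS of the named gene; the function names are hypothetical.

```python
import re

# Accepts optional spaces after the colons, thousands separators, and
# either an ASCII hyphen or the typographic minus sign (U+2212).
_ID_PATTERN = re.compile(
    r"^(upst|int|dst):\s*([A-Za-z0-9]+):\s*([+\-\u2212]?)([\d,]+)$"
)

def parse_lncrna_id(identifier):
    """Split an identifier into (region type, gene symbol, signed offset)."""
    match = _ID_PATTERN.match(identifier.strip())
    if match is None:
        raise ValueError(f"unrecognized lncRNA identifier: {identifier!r}")
    kind, gene, sign, digits = match.groups()
    offset = int(digits.replace(",", ""))
    if sign in ("-", "\u2212"):
        offset = -offset
    return kind, gene, offset

def in_promoter_window(offset, upstream=10_000, downstream=2_000):
    """True if the offset lies within the window around the TSS."""
    return -upstream <= offset <= downstream

panda = parse_lncrna_id("upst:CDKN1A:\u22124,845")
panda_in_window = in_promoter_window(panda[2])
```

Normalizing the identifiers in this way also makes it straightforward to group lncRNAs by gene locus or to sort them by distance from the TSS.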

Intriguingly, a recent study identified a distinct long intergenic noncoding RNA located 15 kb upstream of CDKN1A, named lincRNA-p21, that is induced by p53 and mediates p53-dependent gene repression (Huarte et al. (2010) Cell 142:409-419). Thus, the regulatory sequence upstream of CDKN1A drives the expression of multiple coding and noncoding transcripts that cooperate to regulate the DNA damage response (FIG. 8). These findings provide a vivid example that shows the blurring boundary between ‘genes’ and ‘regulatory sequences’ (Mattick (2003) Bioessays 25:930-939).

Our study provides an initial catalog of lncRNAs in cell-cycle promoters that may play diverse functions. At a minimum, promoter ncRNA expression provides a convenient means of tracking the chromatin state of promoters, which may be of use in cancer biology and regenerative medicine. Future studies are needed to pinpoint the functions of these and likely other ncRNAs emanating from regulatory sequences.

While the preferred embodiments of the invention have been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the invention.

What is claimed is:
 1. A method for diagnosing cancer in a subject, the method comprising: a) measuring the level of a plurality of biomarkers in a biological sample derived from the subject, wherein the plurality of biomarkers comprises one or more long non-coding RNAs (lncRNAs) selected from the group consisting of upst:CCNL1:−2767, int:CDKN1A:+885, upst:CDKN1A:−4845, upst:CDKN2B:−2,817, upst:CDK9:−9782, int:ARF:+4,517, int:ARF:+4530, upst:CDKN1C:−1017, int:CCNG1:+381, and upst:CCNG2:−2953; and b) analyzing the levels of the biomarkers in conjunction with respective reference value ranges for said plurality of biomarkers, wherein differential expression of one or more biomarkers in the biological sample compared to one or more biomarkers in a control sample from a normal subject indicates that the subject has cancer.
 2. The method of claim 1, comprising measuring the levels of upst:CCNL1:−2767, int:CDKN1A:+885, upst:CDKN1A:−4845, upst:CDKN2B:−2,817, upst:CDK9:−9782, int:ARF:+4,517, int:ARF:+4530, upst:CDKN1C:−1017, int:CCNG1:+381, and upst:CCNG2:−2953.
 3. The method of claim 1, wherein the subject is a human being.
 4. The method of claim 1, wherein the biological sample is a biopsy comprising cells from a tumor.
 5. The method of claim 1, wherein the cancer is breast cancer.
 6. The method of claim 5, wherein the breast cancer is metastatic ductal carcinoma.
 7. The method of claim 1, wherein measuring the level of the plurality of biomarkers comprises performing microarray analysis, polymerase chain reaction (PCR), reverse transcriptase polymerase chain reaction (RT-PCR), a Northern blot, serial analysis of gene expression (SAGE), immunoassay, or mass spectrometry.
 8. The method of claim 7, wherein microarray analysis is performed using a microarray comprising a plurality of probes that hybridize to upst:CCNL1:−2767, int:CDKN1A:+885, upst:CDKN1A:−4845, upst:CDKN2B:−2,817, upst:CDK9:−9782, int:ARF:+4,517, int:ARF:+4530, upst:CDKN1C:−1017, int:CCNG1:+381, and upst:CCNG2:−2953.
 9. The method of claim 1, comprising measuring the level of upst:CDKN1A:−4845, wherein increased expression of upst:CDKN1A:−4845 compared to a reference value indicates the subject has cancer.
 10. The method of claim 9, wherein measuring the level of upst:CDKN1A:−4845 comprises performing polymerase chain reaction (PCR) using at least one set of oligonucleotide primers comprising a forward primer and a reverse primer capable of amplifying a upst:CDKN1A:−4845 polynucleotide sequence, wherein at least one set of primers is selected from the group consisting of: a) a forward primer comprising the sequence of SEQ ID NO:4 and a reverse primer comprising the sequence of SEQ ID NO:5; b) a forward primer comprising the sequence of SEQ ID NO:6 and a reverse primer comprising the sequence of SEQ ID NO:7; c) a forward primer comprising the sequence of SEQ ID NO:8 and a reverse primer comprising the sequence of SEQ ID NO:9; and d) a forward primer comprising the sequence of SEQ ID NO:10 and a reverse primer comprising the sequence of SEQ ID NO:11.
 11. A method for monitoring tissue regeneration in a subject, the method comprising: a) measuring the level of a plurality of biomarkers in a biological sample derived from the subject, wherein the plurality of biomarkers comprises one or more long non-coding RNAs (lncRNAs) selected from the group consisting of upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, upst:CDKN1C:−1017; and b) analyzing the levels of the biomarkers in conjunction with respective reference value ranges for said plurality of biomarkers, wherein differential expression of one or more biomarkers in the biological sample compared to one or more biomarkers in a control sample indicates that the tissue is regenerating.
 12. The method of claim 11, wherein the subject is a human being.
 13. The method of claim 11, wherein the biological sample is a biopsy comprising cells from the regenerating tissue.
 14. A biomarker panel comprising a plurality of biomarkers, wherein one or more biomarkers are long non-coding RNAs (lncRNAs) selected from the group consisting of int:CDK6:143, dst:CDKN2A:43877, upst:CCNF:−1721, upst:CCNI:−6398, upst:CCNI:−6621, upst:CCNI:−6883, upst:CDKN1A:−4845, upst:CDK5R1:−4044, upst:CDK5R1:−4410, upst:CCNL2:−1391, upst:CCNL2:−2253, upst:CCNL2:−767, int:CDKN2D:1417, upst:CCNL2:−5540, int:CDKN1A:1420, int:CCNT1:602, upst:CCNL2:−3110, upst:CDK5R1:−5717, upst:CCNL2:−982, upst:CCNE2:−682, int:CDK5R1:183, upst:CDK5R1:−482, upst:CDK8:−798, upst:CDK9:−646, upst:CDK6:−1860, int:CDK6:1276, upst:CDK6:−533, upst:CDKN2C:−8037, upst:CCNK:−899, upst:CNNM3:−248, upst:CDKN1C:−4619, int:CDKN2A:6667, int:ARF:4530, upst:CDKN2B:−15913, upst:CDK6:−1679, upst:CDKN1A:−1210, int:CDKN2B:1926, dst:CDKN2A:39498, upst:CCNL1:−1968, upst:CCNL1:−2234, upst:CCNL1:−2383, upst:CCNL1:−2767, upst:CDK5R2:−6418, upst:CDK4:−7794, upst:CDKN1A:−5830, int:CDKN2C:159, upst:CCNYL2:−36, upst:CCNC:−6760, upst:CDKN2B:−2817, upst:CNNM3:−970, upst:CDK5R2:−6045, upst:CDKN1C:−2196, int:CCND1:874, int:CCND2:1205, upst:CDKN1C:−446, int:CCNG2:390, upst:CDK3:−4148, upst:CCNA2:−250, int:CDKL5:64, upst:CCND2:−3165, int:CCNK:210, int:CDKN1A:885, upst:CDK5R2:−9197, int:CNNM3:1459, upst:CCND1:−1659, int:CCNL2:463, upst:CCNE1:−1190, upst:CDK5R2:−8037, upst:CDKL3:−867, int:CCNG1:381, upst:CCND2:−2874, upst:CDKN2B:−130736, int:CCNI:1042, upst:CCND2:−4757, int:CDK9:352, int:CCND2:1689, int:CDKL5:1682, upst:CDK5R2:−4541, upst:CDK5:−7855, upst:CDK9:−1536, upst:CCND2:−1291, upst:CCND1:−377, int:CCNL1:1097, upst:CDK5R2:−648, upst:CCNL2:−7336, upst:CCND1:−2768, upst:CDK2:−1390, upst:CCNYL3:−8181, dst:CDKN2A:8650, upst:CDK8:−265, upst:CDK4:−4462, upst:CDKN2A:−44, int:CDKN2A:5270, upst:CCNJL:−2749, upst:CNNM4:−1843, upst:CDK5R2:−7376, int:CCNO:1417, upst:CDKN1C:−5, upst:CDKN1C:−6280, upst:ARF:−840, upst:CCND2:−1830, upst:CDK5R1:−206, upst:CCNA1:−1163, int:CCNE2:647, upst:CDK9:−909, upst:CCNYL3:−293, upst:CDKN3:−271, int:CCNT2:640, upst:CCND1:−2574, upst:CCNT2:−319, upst:CDK5R1:−3023, upst:CDK9:−3159, upst:CDK9:−8667, upst:CCNE2:−4956, int:CCND3:2384, upst:CDKN1B:−1362, upst:CCNI:−7899, upst:CCNT2:−6751, int:CDK5:1993, upst:CDK9:−8509, upst:CCND1:−7190, upst:CDKN1C:−7144, upst:CDKN3:−4479, upst:CCNB3:−3258, upst:CCND3:−9303, upst:CDK8:−8337, int:CDKN2C:643, upst:CCNYL3:−1019, upst:CDK5:−2373, int:CNNM4:1658, upst:CCNE2:−8552, upst:CCNG1:−9141, upst:CCND2:−4886, upst:CCNK:−8357, upst:CDK5:−9105, upst:CDKN2B:−108997, int:CCNB2:547, upst:CDKN3:−2291, dst:CDKN2A:30203, upst:CDK2:−5210, upst:CCNL1:−3430, upst:CCNF:−3964, upst:CCNK:−4426, upst:CCNF:−3743, upst:CDK5:−3754, upst:CDKN2B:−35359, upst:CDKN2B:−87467, upst:CDK5R2:−4915, upst:CCNF:−2075, upst:CDK6:−8726, upst:CDKN2B:−90566, int:CDKN2A:4904, int:CDKN2A:4432, upst:ARF:−2148, upst:CDKN2B:−130339, upst:CNNM3:−9238, upst:CCNG1:−4532, int:ARF:15754, upst:CCNF:−1085, upst:CDKN2B:−23831, upst:CDKN1A:−9569, int:CCNI:1874, dst:CDKN2A:45866, int:CCNC:816, upst:CCNC:−5405, upst:CDK4:−1632, upst:CCNK:−3241, upst:CDK10:−1805, upst:CCNJL:−671, upst:CDKN3:−5723, upst:CDKN2B:−15114, upst:CCNE2:−5939, upst:CCNJL:−7299, upst:CCND3:−4248, int:CDK9:1811, upst:CDKN2C:−8538, upst:ARF:−1395, upst:CCND2:−6904, upst:CDK4:−977, upst:CCNE1:−5422, upst:CCNE2:−2828, upst:CDK4:−2133, upst:CDK8:−9630, upst:CDK3:−4497, upst:CCND3:−6423, upst:CCND1:−8918, upst:CDKN2B:−119804, upst:CDKN3:−5438, upst:CDKN2C:−7397, upst:CCNYL1:−3709, upst:CDKL4:−6205, upst:CDKN1A:−2237, upst:CDKN1C:−4093, upst:CCND2:−9042, int:CDK8:566, upst:CDKN2B:−804, upst:CCNE1:−4445, upst:CDKN2B:−74328, upst:CDKN2B:−53107, upst:CCNE1:−9426, upst:CDKN2C:−3161, upst:CCNG2:−2953, upst:CNNM1:−2645, upst:CDKN1A:−1902, upst:CDKN3:−1974, upst:CDK10:−4173, upst:CDK9:−9782, upst:CDKN1C:−5693, upst:CDK5:−9871, upst:CNNM4:−4755, upst:CDKN2B:−31120, upst:CDK2:−8040, upst:CDKN2B:−75214, upst:CDKN2C:−127, upst:CDKN1C:−1017, and upst:CNNM4:−3840.
 15. The biomarker panel of claim 14, comprising upst:CCNG2:−2953, upst:CDKN1A:−4845, upst:CDKN1A:−9569, upst:CCNL1:−2767, int:CCNG1:+381, upst:CDK9:−9782, int:ARF:+4530, and upst:CDKN1C:−1017.
 16. An assay comprising: a) measuring the amount of PANDA in a biological sample derived from a subject; and b) comparing the amount of PANDA with a reference value, and if the amount of PANDA is increased relative to the reference value, identifying the subject as having an increased probability of having cancer.
 17. The method of claim 16, wherein the cancer is metastatic ductal carcinoma.
 18. The method of claim 16, wherein measuring the amount of PANDA in a biological sample comprises performing polymerase chain reaction (PCR) using at least one set of oligonucleotide primers comprising a forward primer and a reverse primer capable of amplifying a PANDA polynucleotide sequence, wherein at least one set of primers is selected from the group consisting of: a) a forward primer comprising the sequence of SEQ ID NO:4 and a reverse primer comprising the sequence of SEQ ID NO:5; b) a forward primer comprising the sequence of SEQ ID NO:6 and a reverse primer comprising the sequence of SEQ ID NO:7; c) a forward primer comprising the sequence of SEQ ID NO:8 and a reverse primer comprising the sequence of SEQ ID NO:9; and d) a forward primer comprising the sequence of SEQ ID NO:10 and a reverse primer comprising the sequence of SEQ ID NO:11.
 19. A method of selecting a treatment regimen for a subject suspected of having cancer, the method comprising: a) measuring the amount of PANDA in a biological sample derived from the subject; b) analyzing the amount of PANDA in conjunction with respective reference value ranges for PANDA, wherein an increased amount of PANDA in the biological sample compared to a control sample indicates that the subject has cancer; and c) selecting an anti-tumor treatment regimen for the subject based on the stage of disease progression in the subject.
 20. The method of claim 19, wherein the anti-tumor treatment regimen comprises administering to a subject in need thereof a therapeutically effective amount of a chemotherapeutic agent in combination with a therapeutically effective amount of one or more PANDA inhibitors, wherein the one or more PANDA inhibitors are selected from the group consisting of a small interfering RNA (siRNA), a microRNA (miRNA), a Piwi-interacting RNA (piRNA), a small nuclear RNA (snRNA), and an antisense oligonucleotide.
 21. The method of claim 20, wherein at least one siRNA comprising a nucleotide sequence selected from the group consisting of SEQ ID NOS:12-14 is administered.
 22. The method of claim 19, wherein measuring the amount of PANDA in a biological sample comprises performing polymerase chain reaction (PCR) with at least one set of oligonucleotide primers comprising a forward primer and a reverse primer capable of amplifying a PANDA polynucleotide sequence, wherein at least one set of primers is selected from the group consisting of: a) a forward primer comprising the sequence of SEQ ID NO:4 and a reverse primer comprising the sequence of SEQ ID NO:5; b) a forward primer comprising the sequence of SEQ ID NO:6 and a reverse primer comprising the sequence of SEQ ID NO:7; c) a forward primer comprising the sequence of SEQ ID NO:8 and a reverse primer comprising the sequence of SEQ ID NO:9; and d) a forward primer comprising the sequence of SEQ ID NO:10 and a reverse primer comprising the sequence of SEQ ID NO:11.
 23. The method of claim 19, further comprising measuring the levels of one or more biomarkers selected from the group consisting of upst:CCNL1:−2767, int:CDKN1A:+885, upst:CDKN1A:−4845, upst:CDKN2B:−2,817, upst:CDK9:−9782, int:ARF:+4,517, int:ARF:+4530, upst:CDKN1C:−1017, int:CCNG1:+381, and upst:CCNG2:−2953; and analyzing the levels of the biomarkers in conjunction with respective reference value ranges for said one or more biomarkers, wherein differential expression of one or more biomarkers in the biological sample compared to said one or more biomarkers in a control sample indicates that the subject has cancer.